What is OUTER UNION CORR?

Updated: 19.05.2024

Can anyone check my understanding of the PROC SQL union operators? My interpretation of the differences between OUTER UNION and UNION is as follows:

UNION removes duplicate rows, whereas OUTER UNION does not.

UNION overlays columns, whereas OUTER UNION does not by default.

So, is there any difference between UNION ALL CORRESPONDING and OUTER UNION CORRESPONDING? It seems that ALL would remove the first difference and CORRESPONDING would remove the second, but I suspect there may be a further difference between the two that I am not seeing.

asked 2014-02-04T00:24:00+04:00, 7 years, 8 months ago

It turns out there is a difference: how columns that exist in only one data set are handled. OUTER UNION CORRESPONDING keeps the columns that appear in only one data set, rather than overlaying them by position. UNION ALL CORRESPONDING drops every column that appears in only one data set.

answered 2014-04-17T20:39:00+04:00, 7 years, 5 months ago

My understanding is that OUTER UNION and UNION ALL are effectively, if not actually, identical. CORR is needed with either one to guarantee that the columns line up: with OUTER UNION the columns are not stacked even if they are identical, whereas with UNION ALL the columns are always stacked even if they are not identical (they must have the same data type or it is an error), with no attention paid to column names at all. In both cases, adding CORR causes like-named columns to be stacked.


PROC SQL can combine the results of two or more queries in various ways by using the following set operators:

UNION: produces all unique rows from both queries.

EXCEPT: produces rows that are part of the first query only.

INTERSECT: produces rows that are common to both query results.

OUTER UNION: concatenates the query results.

The operator is placed between the two queries.

Place a semicolon after the last SELECT statement only. Set operators combine columns from two queries based on their position in the referenced tables without regard to the individual column names. Columns in the same relative position in the two queries must have the same data types. The column names of the tables in the first query become the column names of the output table.

For information about using set operators with more than two query results, see the section about the SQL procedure in the Base SAS Procedures Guide.

The following optional keywords give you more control over set operations:
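The placement of the operator and of the single terminating semicolon can be sketched as follows. This is only an illustration, assuming the SASHELP.CLASS sample table that ships with SAS is available; the WHERE conditions are arbitrary.

```sas
proc sql;
  title 'Girls, plus every student older than 14';
  /* First query: no semicolon here. Its column names (NAME, AGE)
     become the column names of the output. */
  select name, age
    from sashelp.class
    where sex = 'F'
  union
  /* Second query: columns are matched to the first query by position. */
  select name, age
    from sashelp.class
    where age > 14;   /* the only semicolon ends the whole query-expression */
quit;
title;
```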

ALL: does not suppress duplicate rows. When the keyword ALL is specified, PROC SQL does not make a second pass through the data to eliminate duplicate rows. Thus, using ALL is more efficient than not using it. ALL is not necessary with the OUTER UNION operator.

CORRESPONDING (CORR): overlays columns that have the same name in both tables. When used with EXCEPT, INTERSECT, and UNION, CORR suppresses columns that are not in both tables.

Each set operator is described and used in an example based on the following two tables.
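A minimal sketch of two such tables is given here so that the operators can be tried out. The rows are hypothetical: they are chosen to agree with the results described in the sections that follow (a duplicated row 2, two in table A and a row 3, three that occurs in table A only). The column names x, y, and z and the row 4, four in table B are assumptions.

```sas
/* Table A: note the duplicated row (2, two). */
data work.a;
  input x y $;
  datalines;
1 one
2 two
2 two
3 three
;
run;

/* Table B: the second column has a different name (z) on purpose. */
data work.b;
  input x z $;
  datalines;
1 one
2 two
4 four
;
run;

proc sql;
  title 'Table A';
  select * from work.a;
  title 'Table B';
  select * from work.b;
quit;
title;
```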

Whereas join operations combine tables horizontally, set operations combine tables vertically. Therefore, the set diagrams that are included in each section are displayed vertically.

The UNION operator combines two query results. It produces all the unique rows that result from both queries; that is, it returns a row if it occurs in the first table, the second, or both. UNION does not return duplicate rows. If a row occurs more than once, then only one occurrence is returned.

You can use the ALL keyword to request that duplicate rows remain in the output.

The EXCEPT operator returns rows that result from the first query but not from the second query. In this example, the row that contains the values 3 and three exists in the first query (table A) only and is returned by EXCEPT.

Note that the duplicated row in Table A containing the values 2 and two does not appear in the output. EXCEPT does not return duplicate rows that are unmatched by rows in the second query. Adding ALL keeps any duplicate rows that do not occur in the second query.

The INTERSECT operator returns rows from the first query that also occur in the second.

The output of an INTERSECT ALL operation contains the rows produced by the first query that are matched one-to-one with a row produced by the second query. In this example, the output of INTERSECT ALL is the same as INTERSECT.

The OUTER UNION operator concatenates the results of the queries. This example concatenates tables A and B.

Notice that OUTER UNION does not overlay columns from the two tables. To overlay columns that have the same name, use the CORRESPONDING keyword.
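The effect of the CORRESPONDING keyword on OUTER UNION can be sketched with two queries that share only one column name. This assumes the SASHELP.CLASS sample table; the choice of columns is arbitrary.

```sas
proc sql;
  title 'OUTER UNION: no overlay, each query keeps its own columns';
  select name, age    from sashelp.class where sex = 'F'
  outer union
  select name, height from sashelp.class where sex = 'M';

  title 'OUTER UNION CORR: NAME is overlaid, AGE and HEIGHT are both kept';
  select name, age    from sashelp.class where sex = 'F'
  outer union corr
  select name, height from sashelp.class where sex = 'M';
quit;
title;
```

In the second result, AGE is missing for rows that come from the second query and HEIGHT is missing for rows that come from the first.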

There is no keyword in PROC SQL that returns unique rows from the first and second table, but not rows that occur in both. Here is one way you can simulate this operation:

This example shows how to use this operation.

The first EXCEPT returns one unique row from the first table (table A) only. The second EXCEPT returns one unique row from the second table (table B) only. The middle UNION combines the two results. Thus, this query returns the row from the first table that is not in the second table, as well as the row from the second table that is not in the first table.
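The combination of two EXCEPT operations and a UNION described above can be sketched as follows. The tables WORK.A and WORK.B and their rows are hypothetical, built only so that each table has exactly one row that the other lacks.

```sas
data work.a;
  input x y $;
  datalines;
1 one
2 two
2 two
3 three
;
run;

data work.b;
  input x z $;
  datalines;
1 one
2 two
4 four
;
run;

proc sql;
  title 'Rows that occur in A or in B, but not in both';
  /* Parentheses force each EXCEPT to be evaluated before the UNION. */
  (select * from work.a
   except
   select * from work.b)
  union
  (select * from work.b
   except
   select * from work.a);
quit;
title;
```

Because columns are matched by position, the differing names y and z do not matter here; the output takes its column names from table A.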


set-operator is one of the following:

INTERSECT <CORRESPONDING> <ALL>

OUTER UNION <CORRESPONDING>

UNION <CORRESPONDING> <ALL>

EXCEPT <CORRESPONDING> <ALL>

Query Expressions and Table Expressions

A query-expression is one or more table-expressions. Multiple table expressions are linked by set operators. The following figure illustrates the relationship between table-expressions and query-expressions.

PROC SQL provides these set operators:

OUTER UNION: concatenates the query results.

UNION: produces all unique rows from both queries.

EXCEPT: produces rows that are part of the first query only.

INTERSECT: produces rows that are common to both query results.

A query-expression with set operators is evaluated as follows.

Each table-expression is evaluated to produce an (internal) intermediate result table.

Each intermediate result table then becomes an operand linked with a set operator to form an expression, for example, A UNION B.

If the query-expression involves more than two table-expressions, then the result from the first two becomes an operand for the next set operator and operand, such as (A UNION B) EXCEPT C, ((A UNION B) EXCEPT C) INTERSECT D, and so on.

Evaluating a query-expression produces a single output table.

Set operators follow this order of precedence unless they are overridden by parentheses in the expressions: INTERSECT is evaluated first. OUTER UNION, UNION, and EXCEPT have the same level of precedence.

PROC SQL performs set operations even if the tables or views that are referred to in the table-expressions do not have the same number of columns. The ANSI Standard for SQL requires that tables or views that are involved in a set operation have the same number of columns and that the columns have matching data types. To meet that requirement, when a set operation is performed on a table or view that has fewer columns than the one or ones with which it is being linked, PROC SQL extends the table or view with fewer columns by creating columns with missing values of the appropriate data type. This temporary alteration enables the set operation to be performed correctly.

The CORRESPONDING keyword is used only when a set operator is specified. CORR causes PROC SQL to match the columns in table-expressions by name and not by ordinal position. Columns that do not match by name are excluded from the result table, except for the OUTER UNION operator. See OUTER UNION.

For example, when performing a set operation on two table-expressions, PROC SQL matches the first specified column-name (listed in the SELECT clause) from one table-expression with the first specified column-name from the other. If CORR is omitted, then PROC SQL matches the columns by ordinal position.

Except for OUTER UNION, the set operators automatically eliminate duplicate rows from their output tables. The optional ALL keyword preserves the duplicate rows, reduces the execution by one step, and thereby improves the query-expression's performance. You use it when you want to display all the rows resulting from the table-expressions, rather than just the unique rows. The ALL keyword is used only when a set operator is also specified.
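The difference that ALL makes can be sketched with a single column that contains many repeated values. This assumes the SASHELP.CLASS sample table.

```sas
proc sql;
  title 'UNION: each distinct age appears once';
  select age from sashelp.class where sex = 'F'
  union
  select age from sashelp.class where sex = 'M';

  title 'UNION ALL: one row per student, duplicates kept';
  select age from sashelp.class where sex = 'F'
  union all
  select age from sashelp.class where sex = 'M';
quit;
title;
```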

Performing an OUTER UNION is very similar to performing the SAS DATA step with a SET statement. The OUTER UNION concatenates the intermediate results from the table-expressions. Thus, the result table for the query-expression contains all the rows produced by the first table-expression followed by all the rows produced by the second table-expression. Columns with the same name are in separate columns in the result table.

For example, the following query expression concatenates the ME1 and ME2 tables but does not overlay like-named columns. Outer Union of ME1 and ME2 Tables shows the result.

Concatenating tables with the OUTER UNION set operator is similar to performing a union join. See Union Joins for more information.

To overlay columns with the same name, use the CORRESPONDING keyword.

In the resulting concatenated table, notice the following:

OUTER UNION CORRESPONDING retains all nonmatching columns.

For columns with the same name, if a value is missing from the result of the first table-expression, then the value in that column from the second table-expression is inserted.

The ALL keyword is not used with OUTER UNION because this operator's default action is to include all rows in a result table. Thus, both rows from the table ME1 where IDnum is 1120 appear in the output.
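The comparison with the DATA step SET statement can be sketched as follows. The tables WORK.GIRLS and WORK.BOYS are hypothetical subsets of the SASHELP.CLASS sample table, built so that they share one column (NAME) and each have one column of their own.

```sas
data work.girls(keep=name age);
  set sashelp.class;
  where sex = 'F';
run;

data work.boys(keep=name height);
  set sashelp.class;
  where sex = 'M';
run;

/* PROC SQL: like-named columns overlaid, nonmatching columns retained. */
proc sql;
  create table work.stacked_sql as
    select * from work.girls
    outer union corr
    select * from work.boys;
quit;

/* DATA step: concatenation with a SET statement. */
data work.stacked_ds;
  set work.girls work.boys;
run;

/* Check whether the two approaches produced the same table. */
proc compare base=work.stacked_ds compare=work.stacked_sql;
run;
```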

The UNION operator produces a table that contains all the unique rows that result from both table-expressions. That is, the output table contains rows produced by the first table-expression, the second table-expression, or both.

Columns are appended by position in the tables, regardless of the column names. However, the data type of the corresponding columns must match or the union will not occur. PROC SQL issues a warning message and stops executing.

The names of the columns in the output table are the names of the columns from the first table-expression unless a column (such as an expression) has no name in the first table-expression. In such a case, the name of that column in the output table is the name of the respective column in the second table-expression.

In the following example, PROC SQL combines the two tables:

In the following example, ALL includes the duplicate row from ME1. In addition, ALL changes the sorting by specifying that PROC SQL make one pass only. Thus, the values from ME2 are simply appended to the values from ME1.

See Combining Two Tables for another example.

The EXCEPT operator produces (from the first table-expression) an output table that has unique rows that are not in the second table-expression. If the intermediate result from the first table-expression has at least one occurrence of a row that is not in the intermediate result of the second table-expression, then that row (from the first table-expression) is included in the result table.

In the following example, the IN_USA table contains flights to cities within and outside the USA. The OUT_USA table contains flights only to cities outside the USA.

This example returns only the rows from IN_USA that are not also in OUT_USA:

The INTERSECT operator produces an output table that has rows that are common to both tables. For example, using the IN_USA and OUT_USA tables shown above, the following example returns rows that are in both tables:
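Since the IN_USA and OUT_USA tables are not available here, the same two operators can be sketched with two overlapping subsets of the SASHELP.CLASS sample table. The subsets (students older than 12, and male students) are arbitrary assumptions made for illustration.

```sas
proc sql;
  title 'EXCEPT: students older than 12 who are not in the male subset';
  select name, sex, age from sashelp.class where age > 12
  except
  select name, sex, age from sashelp.class where sex = 'M';

  title 'INTERSECT: students who are in both subsets';
  select name, sex, age from sashelp.class where age > 12
  intersect
  select name, sex, age from sashelp.class where sex = 'M';
quit;
title;
```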

How can I avoid this error in SAS?

When I try to combine data sets in SAS, I keep getting the following error for a number of variables:

ERROR: Column 115 from the first contributor of OUTER UNION is not the same type as its counterpart from the second.

I have usually managed to work around this error in one of the following ways:

Changing one of the variables to the same type as the other. For example, converting variable A from numeric to character so that it matches the variable in the other data set, which allows the data sets to be combined.

Importing the data sets that I am trying to combine as CSV files and then adding the GUESSINGROWS option to the PROC IMPORT step.

However, sometimes I still get the error above even after importing my files as CSV and using GUESSINGROWS, and sometimes there are so many affected variables that converting all of them to the same type, so that they match across data sets, is very time-consuming and impractical.

Can anyone advise me on how to easily AVOID this error? Is there another way that people get around it? I get this error so often that I am tired of converting every variable. There must be another way!
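One possible approach, sketched under assumptions rather than offered as the definitive fix: first list every like-named column whose type differs between the two tables by querying DICTIONARY.COLUMNS, and then convert only those columns inside the query with PUT or INPUT. The tables WORK.ONE and WORK.TWO below are hypothetical, built from the SASHELP.CLASS sample table so that AGE is numeric in one and character in the other.

```sas
/* Hypothetical inputs: AGE is numeric in ONE and character in TWO. */
data work.one;
  set sashelp.class(keep=name age);
run;

data work.two(drop=age_num);
  set sashelp.class(keep=name age rename=(age=age_num));
  length age $8;
  age = put(age_num, 8.);
run;

proc sql;
  /* Step 1: report like-named columns whose types disagree. */
  title 'Columns with mismatched types';
  select a.name,
         a.type as type_in_one,
         b.type as type_in_two
    from dictionary.columns as a
         inner join dictionary.columns as b
         on upcase(a.name) = upcase(b.name)
    where a.libname = 'WORK' and a.memname = 'ONE'
      and b.libname = 'WORK' and b.memname = 'TWO'
      and a.type ne b.type;

  /* Step 2: convert the offending column in the query itself,
     so that both contributors supply a numeric AGE. */
  create table work.stacked as
    select name, age from work.one
    outer union corr
    select name, input(age, 8.) as age from work.two;
quit;
title;
```

The report from step 1 shows in one pass which variables need attention, so only those have to be converted instead of checking every variable by hand.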
