好得很程序员自学网

<tfoot draggable='sEl'></tfoot>

MYSQL Nested Join Optimization

* FROM t1 LEFT JOIN (t2, t3, t4) ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)

等于:

 SELECT   *   FROM  t1  LEFT   JOIN  (t2  CROSS   JOIN  t3  CROSS   JOIN   t4)
                   ON  (t2.a = t1.a  AND  t3.b = t1.b  AND  t4.c = t1.c)

   在mysql中,cross join 语法上等效与 inner join(他们可以互相替换),在标准sql中,他们并不等效,

    一般情况下,在只包含inner join操作的join表达式中,圆括号可以被忽略,但外连接时,省略圆括号会带来不一样的结果 :

t1  LEFT   JOIN  (t2  LEFT   JOIN  t3  ON  t2.b = t3.b  OR  t2.b  IS   NULL  )
     ON  t1.a = t2.a

  转变:

(t1  LEFT   JOIN  t2  ON  t1.a = t2.a)  LEFT   JOIN   t3
      ON  t2.b = t3.b  OR  t2.b  IS   NULL 

   然而,这两个表达式并不等效,假设:

 1: 表t1包含 行(1),(2);
2: 表t2包含行(1,101);
3: 表t3包含行(101);


在这种情况下,第一个表达式返回的结果集包括(1,1,101,101) ,(2,null,null,null),然而第二表达式返回(1,1,101,101),(2,null,null,101);

mysql >   select   *   from  
     ->     t1  left   join  
     ->       (t2  left   join  t3  on  t2.b  =  t3.b  or  t2.b  is   null   )
      ->           on  t1.a  =   t2.a;
  +  --  ----+------+------+------+ 
 |  a     |  a     |  b     |  b     | 
 +  --  ----+------+------+------+ 
 |      1   |      1   |    101   |    101   | 
 |      2   |   NULL   |   NULL   |   NULL   | 
 +  --  ----+------+------+------+ 
 2  rows  in   set  ( 0.02   sec)

mysql  >   select   *   from 
     ->     (t1  left   join  t2  on  t1.a  =   t2.a)
      ->         left   join   t3
      ->             on  t2.b  =  t3.b  or  t2.b  is   null  ;
  +  --  ----+------+------+------+ 
 |  a     |  a     |  b     |  b     | 
 +  --  ----+------+------+------+ 
 |      1   |      1   |    101   |    101   | 
 |      2   |   NULL   |   NULL   |    101   | 
 +  --  ----+------+------+------+ 
 2  rows  in   set  ( 0.00  sec)


  <============================================================================>

t1  LEFT   JOIN  (t2, t3)  ON  t1.a = t2.a

 该表达式不能转变成:

t1  LEFT   JOIN  t2  ON  t1.a = t2.a, t3.
 Database   changed
mysql  >   select   *   from  t1  left   join  (t2,t3)  on  t1.a  =   t2.a;
  +  --  ----+------+------+------+ 
 |  a     |  a     |  b     |  b     | 
 +  --  ----+------+------+------+ 
 |      1   |      1   |    101   |    101   | 
 |      2   |   NULL   |   NULL   |   NULL   | 
 +  --  ----+------+------+------+ 
 2  rows  in   set  ( 0.00   sec)

mysql  >   select   *   from  t1  left   join  t2  on  t1.a  =   t2.a, t3;
  +  --  ----+------+------+------+ 
 |  a     |  a     |  b     |  b     | 
 +  --  ----+------+------+------+ 
 |      1   |      1   |    101   |    101   | 
 |      2   |   NULL   |   NULL   |    101   | 
 +  --  ----+------+------+------+ 
 2  rows  in   set  ( 0.00  sec)

不同的结果集

     因此,如果我们去掉有outer join操作的join表达式的圆括号,我们会改变原始表达式的结果集。

    更加准确的, 我们不能去掉左外连接的右括号操作,和右外连接的左括号操作。换句话,我们不能去掉外连接操作中的内表表达式的括号,外表的其他操作括号可以被去掉;

(t1,t2)  LEFT   JOIN  t3  ON  P(t2.b,t3.b)

等效于:

t1, t2  LEFT   JOIN  t3  ON  P(t2.b,t3.b)
 
 SELECT   *   FROM  t1  LEFT   JOIN  (t2  LEFT   JOIN  t3  ON  t2.b = t3.b)  ON  t1.a =  t2.a
    WHERE  t1.a  >   1 

 SELECT   *   FROM  t1  LEFT   JOIN  (t2, t3)  ON  t1.a =  t2.a
    WHERE  (t2.b = t3.b  OR  t2.b  IS   NULL )  AND  t1.a  >   1 

这梁查询都包含内连接:

t2  LEFT   JOIN  t3  ON  t2.b =  t3.b
t2, t3 

 

 SELECT   *   FROM  T1  INNER   JOIN  T2  ON   P1(T1,T2)
                   INNER   JOIN  T3  ON   P2(T2,T3)
    WHERE  P(T1,T2,T3).

该表达式的执行顺序:

 FOR  each row t1  in   T1 {
    FOR  each row t2  in   T2 such that P1(t1,t2) {
      FOR  each row t3  in   T3 such that P2(t2,t3) {
        IF   P(t1,t2,t3) {
         t:  = t1 || t2 ||  t3; OUTPUT t;
      }
    }
  }
} 

 

 SELECT   *   FROM  T1  LEFT   JOIN  
              (T2   LEFT   JOIN  T3  ON   P2(T2,T3))
                ON   P1(T1,T2)
    WHERE  P(T1,T2,T3).

顺序:

 FOR  each row t1  in   T1 {
  BOOL f1:  =  FALSE;
    FOR  each row t2  in   T2 such that P1(t1,t2) {
    BOOL f2:  =  FALSE;
      FOR  each row t3  in   T3 such that P2(t2,t3) {
        IF   P(t1,t2,t3) {
        t:  = t1 || t2 ||  t3; OUTPUT t;
      }
      f2  =  TRUE;
      f1  =  TRUE;
    }
      IF   (!f2) {
        IF  P(t1,t2, NULL  ) {
        t:  = t1 || t2 ||  NULL  ; OUTPUT t;
      }
      f1  =  TRUE;
    }
  }
    IF   (!f1) {
      IF  P(t1, NULL , NULL  ) {
      t:  = t1 ||  NULL  ||  NULL  ; OUTPUT t;
    }
  }
} 

    

    在一般情况下,对于在一外联接操作的任何嵌套循环的第一内表,一个标志被引入在循环之前处于关闭并在循环之后被检查。该标志被打开时,用于从外部表的当前行从代表内操作数表中找到一个匹配的。如果在循环周期结束的标志仍然关闭,没有发现匹配的外部表的当前行。在这种情况下,该行被用于内表的列补充NULL值。结果行被传递到最后的检查为输出或到下一个嵌套循环,但仅当行满足所有嵌入式外的连接条件。

     对于外连接的查询,优化器可以只选择为循环表外循环在内部表之前的顺序。因此,我们外部连接的查询,只有一个嵌套顺序是可能的。对于下面的查询,优化器将评估两个不同的嵌套:

 SELECT   *  T1  LEFT   JOIN  (T2,T3)  ON  P1(T1,T2)  AND   P2(T1,T3)
    WHERE  P(T1,T2,T3)
 FOR  each row t1  in   T1 {
  BOOL f1:  =  FALSE;
    FOR  each row t2  in   T2 such that P1(t1,t2) {
      FOR  each row t3  in   T3 such that P2(t1,t3) {
        IF   P(t1,t2,t3) {
        t:  = t1 || t2 ||  t3; OUTPUT t;
      }
      f1:  =  TRUE
    }
  }
    IF   (!f1) {
      IF  P(t1, NULL , NULL  ) {
      t:  = t1 ||  NULL  ||  NULL  ; OUTPUT t;
    }
  }
} 

和:

 FOR  each row t1  in   T1 {
  BOOL f1:  =  FALSE;
    FOR  each row t3  in   T3 such that P2(t1,t3) {
      FOR  each row t2  in   T2 such that P1(t1,t2) {
        IF   P(t1,t2,t3) {
        t:  = t1 || t2 ||  t3; OUTPUT t;
      }
      f1:  =  TRUE
    }
  }
    IF   (!f1) {
      IF  P(t1, NULL , NULL  ) {
      t:  = t1 ||  NULL  ||  NULL  ; OUTPUT t;
    }
  }
} 

 

如果P可以拆分:

P(T1,T2,T2) = C1(T1) AND C2(T2) AND C3(T3).

算法会变成

 FOR  each row t1  in   T1 such that C1(t1) {
    FOR  each row t2  in  T2 such that P1(t1,t2)  AND   C2(t2)  {
      FOR  each row t3  in  T3 such that P2(t2,t3)  AND   C3(t3) {
        IF   P(t1,t2,t3) {
         t:  = t1 || t2 ||  t3; OUTPUT t;
      }
    }
  }
} 

 

     每个合取C1(T1)的,C 2(T2),C 3(T3)被压出的最内环到最外循环在那里可以进行评估。如果C1(T 1)是一个非常严格的条件下,此条件下推可能大大从传递给内表T1减少行数。其结果是,在执行时间的查询可能提高极大。

 

MYSQL Nested Join Optimization

标签:

查看更多关于MYSQL Nested Join Optimization的详细内容...

  阅读:36次