ovn-northd(8)                 Open vSwitch Manual                ovn-northd(8)



NAME
       ovn-northd - Open Virtual Network central control daemon

SYNOPSIS
       ovn-northd [options]

DESCRIPTION
       ovn-northd  is  a  centralized  daemon  responsible for translating the
       high-level OVN configuration into logical configuration  consumable  by
       daemons  such as ovn-controller. It translates the logical network con‐
       figuration in terms of conventional network concepts,  taken  from  the
       OVN Northbound Database (see ovn-nb(5)), into logical datapath flows in
       the OVN Southbound Database (see ovn-sb(5)) below it.

OPTIONS
       --ovnnb-db=database
              The OVSDB database containing the OVN  Northbound  Database.  If
              the  OVN_NB_DB environment variable is set, its value is used as
              the default. Otherwise, the default is unix:/ovnnb_db.sock.

       --ovnsb-db=database
              The OVSDB database containing the OVN  Southbound  Database.  If
              the  OVN_SB_DB environment variable is set, its value is used as
              the default. Otherwise, the default is unix:/ovnsb_db.sock.

       database in the above options must be an OVSDB active or  passive  con‐
       nection method, as described in ovsdb(7).
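
        For example, a deployment that keeps the databases on a remote
        host might be started as follows; the address and the
        conventional 6641/6642 ports shown here are illustrative:

               ovn-northd --ovnnb-db=tcp:172.16.0.10:6641 \
                          --ovnsb-db=tcp:172.16.0.10:6642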

   Daemon Options
       --pidfile[=pidfile]
              Causes a file (by default, program.pid) to be created indicating
              the PID of the running process. If the pidfile argument  is  not
              specified, or if it does not begin with /, then it is created in
              .

              If --pidfile is not specified, no pidfile is created.

       --overwrite-pidfile
              By default, when --pidfile is specified and the  specified  pid‐
              file already exists and is locked by a running process, the dae‐
              mon refuses to start. Specify --overwrite-pidfile to cause it to
              instead overwrite the pidfile.

              When --pidfile is not specified, this option has no effect.

       --detach
              Runs  this  program  as a background process. The process forks,
              and in the child it starts a new session,  closes  the  standard
              file descriptors (which has the side effect of disabling logging
              to the console), and changes its current directory to  the  root
              (unless  --no-chdir is specified). After the child completes its
              initialization, the parent exits.

       --monitor
              Creates an additional process to monitor  this  program.  If  it
               dies due to a signal that indicates a programming error (SIGABRT,
               SIGALRM, SIGBUS, SIGFPE, SIGILL, SIGPIPE, SIGSEGV, SIGXCPU,
              or SIGXFSZ) then the monitor process starts a new copy of it. If
              the daemon dies or exits for another reason, the monitor process
              exits.

              This  option  is  normally used with --detach, but it also func‐
              tions without it.

       --no-chdir
              By default, when --detach is specified, the daemon  changes  its
              current  working  directory  to  the  root  directory  after  it
              detaches. Otherwise, invoking the daemon from a carelessly  cho‐
              sen  directory  would  prevent the administrator from unmounting
              the file system that holds that directory.

              Specifying --no-chdir suppresses this behavior,  preventing  the
              daemon  from changing its current working directory. This may be
              useful for collecting core files, since it is common behavior to
              write core dumps into the current working directory and the root
              directory is not a good directory to use.

              This option has no effect when --detach is not specified.

       --no-self-confinement
              By default this daemon will try to self-confine itself  to  work
              with  files  under  well-known  directories whitelisted at build
               time. It is better to stick with this default behavior and
               not to use this flag unless some other access control is
               used to confine the daemon. Note that in contrast to other
               access control
              implementations  that  are  typically enforced from kernel-space
              (e.g. DAC or MAC), self-confinement is imposed  from  the  user-
              space daemon itself and hence should not be considered as a full
              confinement strategy, but instead should be viewed as  an  addi‐
              tional layer of security.

       --user=user:group
              Causes  this  program  to  run  as a different user specified in
              user:group, thus dropping most of  the  root  privileges.  Short
              forms  user  and  :group  are also allowed, with current user or
               group assumed, respectively. Only daemons started by the
               root user accept this argument.

              On   Linux,   daemons   will   be   granted   CAP_IPC_LOCK   and
               CAP_NET_BIND_SERVICE before dropping root privileges. Daemons
              that  interact  with  a  datapath, such as ovs-vswitchd, will be
              granted three  additional  capabilities,  namely  CAP_NET_ADMIN,
              CAP_NET_BROADCAST  and  CAP_NET_RAW.  The capability change will
              apply even if the new user is root.

              On Windows, this option is not currently supported. For security
              reasons,  specifying  this  option will cause the daemon process
              not to start.

   Logging Options
       -v[spec]
       --verbose=[spec]
            Sets logging levels. Without any spec,  sets  the  log  level  for
            every  module and destination to dbg. Otherwise, spec is a list of
            words separated by spaces or commas or colons, up to one from each
            category below:

            ·      A  valid module name, as displayed by the vlog/list command
                   on ovs-appctl(8), limits the log level change to the speci‐
                   fied module.

             ·      syslog, console, or file, to limit the log level change
                    to only the system log, the console, or the file,
                   respectively.  (If --detach is specified, the daemon closes
                   its standard file descriptors, so logging  to  the  console
                   will have no effect.)

                   On  Windows  platform,  syslog is accepted as a word and is
                   only useful along with the --syslog-target option (the word
                   has no effect otherwise).

            ·      off,  emer,  err,  warn,  info,  or dbg, to control the log
                   level. Messages of the given severity  or  higher  will  be
                   logged,  and  messages  of  lower severity will be filtered
                   out. off filters out all messages. See ovs-appctl(8) for  a
                   definition of each log level.

            Case is not significant within spec.

            Regardless  of the log levels set for file, logging to a file will
            not take place unless --log-file is also specified (see below).

            For compatibility with older versions of OVS, any is accepted as a
            word but has no effect.
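
             For example, one plausible combination of these words, shown
             here only as a sketch:

                    ovn-northd --log-file \
                               -vconsole:off -vsyslog:warn -vfile:dbg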

       -v
       --verbose
             Sets the maximum logging verbosity level, equivalent to
             --verbose=dbg.

       -vPATTERN:destination:pattern
       --verbose=PATTERN:destination:pattern
            Sets  the  log  pattern  for  destination  to  pattern.  Refer  to
            ovs-appctl(8) for a description of the valid syntax for pattern.

       -vFACILITY:facility
       --verbose=FACILITY:facility
            Sets  the RFC5424 facility of the log message. facility can be one
            of kern, user, mail, daemon, auth, syslog, lpr, news, uucp, clock,
            ftp,  ntp,  audit,  alert, clock2, local0, local1, local2, local3,
            local4, local5, local6 or local7. If this option is not specified,
            daemon  is  used  as  the  default for the local system syslog and
            local0 is used while sending a message to the target provided  via
            the --syslog-target option.

       --log-file[=file]
            Enables  logging  to a file. If file is specified, then it is used
            as the exact name for the log file. The default log file name used
            if file is omitted is /usr/local/var/log/ovn/program.log.

       --syslog-target=host:port
            Send  syslog messages to UDP port on host, in addition to the sys‐
            tem syslog. The host must be a numerical IP address, not  a  host‐
            name.

       --syslog-method=method
            Specify  method  as  how  syslog messages should be sent to syslog
            daemon. The following forms are supported:

             ·      libc, to use the libc syslog() function. The downside
                    of using this option is that libc adds a fixed prefix
                    to every message before it is actually sent to the
                    syslog daemon over the /dev/log UNIX domain socket.

             ·      unix:file, to use a UNIX domain socket directly. It is
                    possible to specify an arbitrary message format with
                    this option. However, rsyslogd 8.9 and older versions
                    use a hard-coded parser function that limits UNIX
                    domain socket use. If you want to use an arbitrary
                    message format with older rsyslogd versions, then use
                    a UDP socket to a localhost IP address instead.

             ·      udp:ip:port, to use a UDP socket. With this method it
                    is possible to use an arbitrary message format also
                    with older rsyslogd. When sending syslog messages over
                    a UDP socket, extra precautions need to be taken: for
                    example, the syslog daemon needs to be configured to
                    listen on the specified UDP port, accidental iptables
                    rules could interfere with local syslog traffic, and
                    there are some security considerations that apply to
                    UDP sockets but do not apply to UNIX domain sockets.

            ·      null, to discard all messages logged to syslog.

            The  default is taken from the OVS_SYSLOG_METHOD environment vari‐
            able; if it is unset, the default is libc.
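
             For example, following the advice above for older rsyslogd
             versions, messages can be sent to the local syslog daemon
             over UDP; the port is an assumption and must match the
             daemon’s configuration:

                    ovn-northd --syslog-method=udp:127.0.0.1:514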

   PKI Options
       PKI configuration is required in order to use SSL for  the  connections
       to the Northbound and Southbound databases.

              -p privkey.pem
              --private-key=privkey.pem
                   Specifies  a  PEM  file  containing the private key used as
                   identity for outgoing SSL connections.

              -c cert.pem
              --certificate=cert.pem
                   Specifies a PEM file containing a certificate  that  certi‐
                   fies the private key specified on -p or --private-key to be
                   trustworthy. The certificate must be signed by the certifi‐
                   cate  authority  (CA) that the peer in SSL connections will
                   use to verify it.

              -C cacert.pem
              --ca-cert=cacert.pem
                   Specifies a PEM file containing the CA certificate for ver‐
                   ifying certificates presented to this program by SSL peers.
                   (This may be the same certificate that  SSL  peers  use  to
                   verify the certificate specified on -c or --certificate, or
                   it may be a different one, depending on the PKI  design  in
                   use.)

              -C none
              --ca-cert=none
                   Disables  verification  of  certificates  presented  by SSL
                   peers. This introduces a security risk,  because  it  means
                   that  certificates  cannot be verified to be those of known
                   trusted hosts.
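
        Taken together, an SSL-enabled invocation might look like the
        following sketch, in which the file names and address are
        placeholders:

               ovn-northd --ovnnb-db=ssl:172.16.0.10:6641 \
                          --ovnsb-db=ssl:172.16.0.10:6642 \
                          -p privkey.pem -c cert.pem -C cacert.pem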

   Other Options
       --unixctl=socket
              Sets the name of the control socket on which program listens for
              runtime  management  commands  (see RUNTIME MANAGEMENT COMMANDS,
              below). If socket does not begin with /, it  is  interpreted  as
              relative  to  .  If  --unixctl  is  not used at all, the default
              socket is /program.pid.ctl, where pid is program’s process ID.

               On Windows a local named pipe is used to listen for runtime
               management commands. A file is created at the absolute path
               pointed to by socket or, if --unixctl is not used at all, a
               file is created as program in the configured OVS_RUNDIR
               directory. The file exists just to mimic the behavior of a
               Unix domain socket.

              Specifying none for socket disables the control socket feature.



       -h
       --help
            Prints a brief help message to the console.

       -V
       --version
            Prints version information to the console.

RUNTIME MANAGEMENT COMMANDS
       ovs-appctl can send commands to a running ovn-northd process. The  cur‐
       rently supported commands are described below.

              exit   Causes ovn-northd to gracefully terminate.

               pause  Pauses ovn-northd so that it stops processing any
                      Northbound and Southbound database changes. This
                      will also instruct ovn-northd to drop its lock on
                      the SB DB.

               resume Resumes ovn-northd so that it processes Northbound
                      and Southbound database contents and generates
                      logical flows again. This will also instruct
                      ovn-northd to try to acquire the lock on the SB DB.

              is-paused
                     Returns "true" if ovn-northd is currently paused, "false"
                     otherwise.

               status Prints this server’s status. Status will be "active"
                      if ovn-northd has acquired the OVSDB lock on the SB
                      DB, "standby" if it has not, or "paused" if this
                      instance is paused.
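
        For example, assuming the default control socket location, the
        commands above can be issued with ovs-appctl:

               ovs-appctl -t ovn-northd status
               ovs-appctl -t ovn-northd is-paused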

ACTIVE-STANDBY FOR HIGH AVAILABILITY
       You  may  run ovn-northd more than once in an OVN deployment. When con‐
       nected to a standalone or clustered DB setup,  OVN  will  automatically
       ensure that only one of them is active at a time. If multiple instances
       of ovn-northd are running and the active ovn-northd fails, one  of  the
       hot standby instances of ovn-northd will automatically take over.

   Active-Standby with multiple OVN DB servers
       You may run multiple OVN DB servers in an OVN deployment with:

              ·      OVN  DB  servers deployed in active/passive mode with one
                     active and multiple passive ovsdb-servers.

               ·      ovn-northd also deployed on all these nodes, using
                      unix sockets to connect to the local OVN DB servers.

        In such deployments, the ovn-northd instances on the passive nodes
        will process the DB changes and compute logical flows that are
        ultimately discarded, because the passive ovsdb-servers do not
        allow write transactions. This results in unnecessary CPU usage.

       With the help of  runtime  management  command  pause,  you  can  pause
       ovn-northd  on these nodes. When a passive node becomes master, you can
       use the runtime management command resume to resume the  ovn-northd  to
       process the DB changes.
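
        As a sketch, assuming the default control socket location:

               # On a node whose local ovsdb-servers are passive:
               ovs-appctl -t ovn-northd pause

               # After the node becomes master:
               ovs-appctl -t ovn-northd resume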

LOGICAL FLOW TABLE STRUCTURE
       One  of the main purposes of ovn-northd is to populate the Logical_Flow
       table in  the  OVN_Southbound  database.  This  section  describes  how
       ovn-northd does this for switch and router logical datapaths.

   Logical Switch Datapaths
     Ingress Table 0: Admission Control and Ingress Port Security - L2

       Ingress table 0 contains these logical flows:

              ·      Priority 100 flows to drop packets with VLAN tags or mul‐
                     ticast Ethernet source addresses.

              ·      Priority 50 flows that implement  ingress  port  security
                     for each enabled logical port. For logical ports on which
                     port security is enabled, these match the inport and  the
                     valid  eth.src address(es) and advance only those packets
                     to the next flow table. For logical ports on  which  port
                     security  is  not enabled, these advance all packets that
                     match the inport.

       There are no flows for disabled logical ports because the  default-drop
       behavior  of  logical flow tables causes packets that ingress from them
       to be dropped.
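
        The port security addresses matched above come from the
        Northbound database; for example, with ovn-nbctl (the port name
        and addresses are illustrative):

               ovn-nbctl lsp-set-port-security sw0-p1 \
                   "00:00:00:00:00:01 192.168.0.11"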

     Ingress Table 1: Ingress Port Security - IP

       Ingress table 1 contains these logical flows:

              ·      For each element in the port security set having  one  or
                     more IPv4 or IPv6 addresses (or both),

                     ·      Priority  90  flow to allow IPv4 traffic if it has
                            IPv4  addresses  which  match  the  inport,  valid
                            eth.src and valid ip4.src address(es).

                      ·      Priority 90 flow to allow IPv4 DHCP discovery
                             traffic if it has a valid eth.src. This is
                             necessary because DHCP discovery messages are
                             sent from the unspecified IPv4 address
                             (0.0.0.0), the IPv4 address not yet having
                             been assigned.

                     ·      Priority  90  flow to allow IPv6 traffic if it has
                            IPv6  addresses  which  match  the  inport,  valid
                            eth.src and valid ip6.src address(es).

                      ·      Priority 90 flow to allow IPv6 DAD (Duplicate
                             Address Detection) traffic if it has a valid
                             eth.src. This is necessary since DAD requires
                             joining a multicast group and sending
                             neighbor solicitations for the newly assigned
                             address. Since no address is yet assigned,
                             these are sent from the unspecified IPv6
                             address (::).

                      ·      Priority 80 flow to drop IP (both IPv4 and
                             IPv6) traffic which matches the inport and
                             valid eth.src.

              ·      One priority-0 fallback flow that matches all packets and
                     advances to the next table.

     Ingress Table 2: Ingress Port Security - Neighbor discovery

       Ingress table 2 contains these logical flows:

              ·      For each element in the port security set,

                      ·      Priority 90 flow to allow ARP traffic which
                             matches the inport and valid eth.src and
                             arp.sha. If the element has one or more IPv4
                             addresses, then it also matches the valid
                             arp.spa.

                      ·      Priority 90 flow to allow IPv6 Neighbor
                             Solicitation and Advertisement traffic which
                             matches the inport, valid eth.src and
                             nd.sll/nd.tll. If the element has one or more
                             IPv6 addresses, then it also matches the
                             valid nd.target address(es) for Neighbor
                             Advertisement traffic.

                      ·      Priority 80 flow to drop ARP and IPv6
                             Neighbor Solicitation and Advertisement
                             traffic which matches the inport and valid
                             eth.src.

              ·      One priority-0 fallback flow that matches all packets and
                     advances to the next table.

      Ingress Table 3: from-lport Pre-ACLs

       This table prepares flows  for  possible  stateful  ACL  processing  in
       ingress  table  ACLs.  It  contains a priority-0 flow that simply moves
       traffic to the next table. If stateful ACLs are  used  in  the  logical
       datapath, a priority-100 flow is added that sets a hint (with reg0[0] =
       1; next;) for table Pre-stateful to send IP packets to  the  connection
       tracker  before  eventually advancing to ingress table ACLs. If special
        ports such as router ports or localnet ports can’t use ct(), a
        priority-110 flow is added to skip over stateful ACLs.

        This table also has a priority-110 flow with the match eth.dst == E
        for all logical switch datapaths to move traffic to the next table,
        where E is the service monitor mac defined in the
        options:svc_monitor_mac column of the NB_Global table.

     Ingress Table 4: Pre-LB

       This table prepares flows for possible stateful load balancing process‐
       ing  in  ingress  table  LB and Stateful. It contains a priority-0 flow
       that simply moves traffic to the next table.  Moreover  it  contains  a
       priority-110  flow  to move IPv6 Neighbor Discovery traffic to the next
       table. If load balancing rules with virtual IP  addresses  (and  ports)
       are  configured  in  OVN_Northbound database for a logical switch data‐
       path, a priority-100 flow is  added  for  each  configured  virtual  IP
        address VIP. For IPv4 VIPs, the match is ip && ip4.dst == VIP. For
        IPv6 VIPs, the match is ip && ip6.dst == VIP. The flow sets an action
       reg0[0]  =  1; next; to act as a hint for table Pre-stateful to send IP
       packets to the connection tracker for  packet  de-fragmentation  before
       eventually  advancing to ingress table LB. If controller_event has been
       enabled and load balancing rules with empty backends have been added in
        OVN_Northbound, a priority-130 flow is added to trigger
        ovn-controller events whenever the chassis receives a packet for
        that particular VIP. If the event-elb meter has been previously
        created, it will be associated with the empty_lb logical flow.

        This table also has a priority-110 flow with the match eth.dst == E
        for all logical switch datapaths to move traffic to the next table,
        where E is the service monitor mac defined in the
        options:svc_monitor_mac column of the NB_Global table.
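
        The load balancing rules that drive these flows come from the
        Load_Balancer table; a minimal sketch with ovn-nbctl, in which
        the names and addresses are illustrative:

               # Create a VIP with two backends, then attach it to a
               # logical switch.
               ovn-nbctl lb-add lb0 30.0.0.10:80 \
                   "192.168.0.11:80,192.168.0.12:80"
               ovn-nbctl ls-lb-add sw0 lb0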

     Ingress Table 5: Pre-stateful

       This  table prepares flows for all possible stateful processing in next
       tables. It contains a priority-0 flow that simply moves traffic to  the
       next table. A priority-100 flow sends the packets to connection tracker
       based on a hint provided by the  previous  tables  (with  a  match  for
       reg0[0] == 1) by using the ct_next; action.

      Ingress Table 6: from-lport ACLs

        Logical flows in this table closely reproduce those in the ACL
        table in the OVN_Northbound database for the from-lport direction.
        The priority values from the ACL table have a limited range and
        have 1000 added to them to leave room for OVN default flows at both
        higher and lower priorities; a short example follows this list.

              ·      allow  ACLs  translate  into logical flows with the next;
                     action. If there are any stateful ACLs on this  datapath,
                     then allow ACLs translate to ct_commit; next; (which acts
                     as a hint for the next tables to commit the connection to
                     conntrack),

              ·      allow-related  ACLs translate into logical flows with the
                     ct_commit(ct_label=0/1); next; actions  for  new  connec‐
                     tions and reg0[1] = 1; next; for existing connections.

              ·      Other  ACLs  translate to drop; for new or untracked con‐
                     nections and ct_commit(ct_label=1/1); for  known  connec‐
                     tions.  Setting  ct_label  marks a connection as one that
                     was previously allowed, but should no longer  be  allowed
                     due to a policy change.

        This table also contains a priority 0 flow with action next;, so
        that ACLs allow packets by default. If the logical datapath has a
        stateful ACL, the following flows will also be added:

              ·      A priority-1 flow that sets the hint to commit IP traffic
                     to the connection  tracker  (with  action  reg0[1]  =  1;
                     next;).  This  is  needed  for  the  default allow policy
                     because, while the initiator’s direction may not have any
                     stateful  rules,  the  server’s  may  and then its return
                     traffic would not be known and marked as invalid.

              ·      A priority-65535 flow that  allows  any  traffic  in  the
                     reply  direction for a connection that has been committed
                     to the connection tracker (i.e., established  flows),  as
                     long as the committed flow does not have ct_label.blocked
                     set. We only handle traffic in the reply  direction  here
                     because  we  want all packets going in the request direc‐
                     tion to still go through the  flows  that  implement  the
                     currently  defined  policy based on ACLs. If a connection
                     is no longer allowed by policy, ct_label.blocked will get
                     set  and packets in the reply direction will no longer be
                     allowed, either.

              ·      A priority-65535 flow that allows  any  traffic  that  is
                     considered  related to a committed flow in the connection
                     tracker (e.g., an ICMP Port Unreachable from  a  non-lis‐
                     tening  UDP port), as long as the committed flow does not
                     have ct_label.blocked set.

              ·      A priority-65535 flow that drops all  traffic  marked  by
                     the connection tracker as invalid.

              ·      A priority-65535 flow that drops all traffic in the reply
                     direction with ct_label.blocked set meaning that the con‐
                     nection  should  no  longer  be  allowed  due to a policy
                     change. Packets in the request direction are skipped here
                     to let a newly created ACL re-allow this connection.

               ·      A priority 34000 logical flow is added for each
                      logical switch datapath with the match eth.dst == E
                      to allow the service monitor reply packet destined to
                      ovn-controller with the action next, where E is the
                      service monitor mac defined in the
                      options:svc_monitor_mac column of the NB_Global
                      table.
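
        As an example of the priority offset described above, an ACL
        added at priority 1000 appears in the Logical_Flow table at
        priority 2000; the switch and port names here are illustrative:

               ovn-nbctl acl-add sw0 from-lport 1000 \
                   'inport == "sw0-p1" && ip4' allow-related
               # Inspect the resulting logical flows:
               ovn-sbctl lflow-list sw0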

      Ingress Table 7: from-lport QoS Marking

       Logical flows in this table closely reproduce those in  the  QoS  table
       with  the  action  column  set  in  the OVN_Northbound database for the
       from-lport direction.

              ·      For every qos_rules entry in a logical switch  with  DSCP
                     marking  enabled,  a  flow  will be added at the priority
                     mentioned in the QoS table.

              ·      One priority-0 fallback flow that matches all packets and
                     advances to the next table.

      Ingress Table 8: from-lport QoS Meter

       Logical  flows  in  this table closely reproduce those in the QoS table
       with the bandwidth column set in the OVN_Northbound  database  for  the
       from-lport direction.

               ·      For every qos_rules entry in a logical switch with
                      metering enabled, a flow will be added at the
                      priority mentioned in the QoS table.

              ·      One priority-0 fallback flow that matches all packets and
                     advances to the next table.

     Ingress Table 9: LB

       It contains a priority-0 flow that simply moves traffic to the next ta‐
       ble.  For established connections a priority 100 flow matches on ct.est
        && !ct.rel && !ct.new && !ct.inv and sets an action reg0[2] = 1; next;
       to  act as a hint for table Stateful to send packets through connection
       tracker to NAT the packets. (The packet will automatically  get  DNATed
       to the same IP address as the first packet in that connection.)

     Ingress Table 10: Stateful

              ·      For  all the configured load balancing rules for a switch
                     in OVN_Northbound database that includes a L4  port  PORT
                     of  protocol P and IP address VIP, a priority-120 flow is
                      added. For IPv4 VIPs, the flow matches ct.new && ip &&
                      ip4.dst == VIP && P && P.dst == PORT. For IPv6 VIPs,
                      the flow matches ct.new && ip && ip6.dst == VIP && P
                      && P.dst == PORT. The flow’s action is ct_lb(args),
                      where args
                     contains comma separated IP addresses (and optional  port
                     numbers) to load balance to. The address family of the IP
                     addresses of args is the same as the  address  family  of
                     VIP. If health check is enabled, then args will only con‐
                     tain those endpoints whose service monitor  status  entry
                     in OVN_Southbound db is either online or empty.

              ·      For  all the configured load balancing rules for a switch
                     in OVN_Northbound  database  that  includes  just  an  IP
                     address  VIP  to  match on, OVN adds a priority-110 flow.
                      For IPv4 VIPs, the flow matches ct.new && ip && ip4.dst
                      == VIP. For IPv6 VIPs, the flow matches ct.new && ip &&
                     ip6.dst == VIP. The action on this flow  is  ct_lb(args),
                     where  args  contains comma separated IP addresses of the
                     same address family as VIP.

              ·      A priority-100 flow commits packets to connection tracker
                     using ct_commit; next; action based on a hint provided by
                     the previous tables (with a match for reg0[1] == 1).

              ·      A priority-100  flow  sends  the  packets  to  connection
                     tracker  using  ct_lb; as the action based on a hint pro‐
                     vided by the previous tables (with a match for reg0[2] ==
                     1).

              ·      A  priority-0  flow that simply moves traffic to the next
                     table.

     Ingress Table 11: Pre-Hairpin

              ·      For all configured load balancer  backends  a  priority-2
                     flow that matches on traffic that needs to be hairpinned,
                     i.e., after load balancing the destination IP matches the
                     source   IP,   which  sets  reg0[6]  =  1   and  executes
                     ct_snat(VIP) to force replies to these  packets  to  come
                     back through OVN.

              ·      For  all  configured  load balancer backends a priority-1
                     flow that matches on replies to hairpinned traffic, i.e.,
                     destination  IP  is  VIP, source IP is the backend IP and
                     source L4 port is backend port, which sets  reg0[6]  =  1
                     and executes ct_snat;.

              ·      A  priority-0  flow that simply moves traffic to the next
                     table.

     Ingress Table 12: Hairpin

               ·      A priority-1 flow that hairpins traffic matched by
                      non-default flows in the Pre-Hairpin table.
                      Hairpinning is done at L2: Ethernet addresses are
                      swapped and the packets are looped back on the input
                      port.

              ·      A  priority-0  flow that simply moves traffic to the next
                     table.

     Ingress Table 13: ARP/ND responder

       This table implements ARP/ND responder in a logical  switch  for  known
       IPs. The advantage of the ARP responder flow is to limit ARP broadcasts
       by locally responding to ARP requests without the need to send to other
       hypervisors. One common case is when the inport is a logical port asso‐
       ciated with a VIF and the broadcast is responded to on the local hyper‐
       visor  rather  than broadcast across the whole network and responded to
       by the destination VM. This behavior is proxy ARP.

       ARP requests arrive from VMs from  a  logical  switch  inport  of  type
       default.  For  this case, the logical switch proxy ARP rules can be for
       other VMs or logical router ports. Logical switch proxy ARP  rules  may
       be  programmed  both  for  mac binding of IP addresses on other logical
       switch VIF ports (which are of the default logical  switch  port  type,
       representing connectivity to VMs or containers), and for mac binding of
       IP addresses on logical switch router type  ports,  representing  their
       logical  router  port  peers. In order to support proxy ARP for logical
       router ports, an IP address must be configured on  the  logical  switch
       router  type port, with the same value as the peer logical router port.
       The configured MAC addresses must match as well. When a VM sends an ARP
       request  for  a  distributed logical router port and if the peer router
       type port of the attached logical switch does not have  an  IP  address
       configured,  the  ARP  request will be broadcast on the logical switch.
       One of the copies of the ARP request will go through the logical switch
       router  type  port  to  the  logical router datapath, where the logical
       router ARP responder will generate a reply. The MAC binding of  a  dis‐
       tributed  logical router, once learned by an associated VM, is used for
       all that VM’s communication needing routing. Hence, the action of a  VM
       re-arping  for  the  mac  binding  of the logical router port should be
       rare.

       Logical switch ARP responder proxy ARP  rules  can  also  be  hit  when
       receiving  ARP  requests externally on a L2 gateway port. In this case,
        the hypervisor acting as an L2 gateway responds to the ARP request
        on behalf of a destination VM.

       Note  that  ARP requests received from localnet or vtep logical inports
        can either go directly to VMs, in which case the VM responds, or can hit
       an  ARP  responder  for  a logical router port if the packet is used to
       resolve a logical router port next hop address. In either case, logical
       switch  ARP  responder rules will not be hit. It contains these logical
       flows:

               ·      Priority-100 flows to skip the ARP responder if
                      inport is of type localnet or vtep and advance
                      directly to the next table. ARP requests sent to
                      localnet or vtep ports
                     can be received by multiple hypervisors. Now, because the
                     same mac binding rules are downloaded to all hypervisors,
                     each  of the multiple hypervisors will respond. This will
                     confuse L2 learning on the source of  the  ARP  requests.
                     ARP requests received on an inport of type router are not
                     expected to hit any logical switch ARP  responder  flows.
                     However,  no  skip flows are installed for these packets,
                     as there would be some additional flow cost for this  and
                     the value appears limited.

               ·      If inport V is of type virtual, a priority-100
                      logical flow is added for each P configured in the
                      options:virtual-parents column with the match

                      inport == P && ((arp.op == 1 && arp.spa == VIP && arp.tpa == VIP) || (arp.op == 2 && arp.spa == VIP))


                     and applies the action

                     bind_vport(V, inport);


                     and advances the packet to the next table.

                     Where  VIP  is  the  virtual  ip configured in the column
                     options:virtual-ip.

              ·      Priority-50 flows that match ARP requests to  each  known
                     IP  address  A  of every logical switch port, and respond
                     with ARP replies  directly  with  corresponding  Ethernet
                     address E:

                     eth.dst = eth.src;
                     eth.src = E;
                     arp.op = 2; /* ARP reply. */
                     arp.tha = arp.sha;
                     arp.sha = E;
                     arp.tpa = arp.spa;
                     arp.spa = A;
                     outport = inport;
                     flags.loopback = 1;
                     output;


                     These  flows  are  omitted  for logical ports (other than
                     router ports or localport ports) that are  down  and  for
                     logical ports of type virtual.

              ·      Priority-50  flows  that match IPv6 ND neighbor solicita‐
                     tions to each known IP address A (and A’s solicited  node
                     address)  of  every  logical  switch  port except of type
                     router, and respond with neighbor advertisements directly
                     with corresponding Ethernet address E:

                     nd_na {
                         eth.src = E;
                         ip6.src = A;
                         nd.target = A;
                         nd.tll = E;
                         outport = inport;
                         flags.loopback = 1;
                         output;
                     };


                     Priority-50  flows  that match IPv6 ND neighbor solicita‐
                     tions to each known IP address A (and A’s solicited  node
                     address)  of  logical  switch  port  of  type router, and
                     respond with neighbor advertisements directly with corre‐
                     sponding Ethernet address E:

                     nd_na_router {
                         eth.src = E;
                         ip6.src = A;
                         nd.target = A;
                         nd.tll = E;
                         outport = inport;
                         flags.loopback = 1;
                         output;
                     };


                     These  flows  are  omitted  for logical ports (other than
                     router ports or localport ports) that are  down  and  for
                     logical ports of type virtual.

              ·      Priority-100  flows  with match criteria like the ARP and
                     ND flows above, except that they only match packets  from
                     the  inport  that owns the IP addresses in question, with
                     action next;. These flows prevent OVN from  replying  to,
                     for  example,  an ARP request emitted by a VM for its own
                     IP address. A VM only  makes  this  kind  of  request  to
                     attempt  to  detect a duplicate IP address assignment, so
                     sending a reply will prevent the VM from accepting the IP
                     address that it owns.

                     In  place  of  next;, it would be reasonable to use drop;
                     for the flows’ actions. If everything is working as it is
                     configured,  then  this would produce equivalent results,
                     since no host should reply to the request. But ARPing for
                     one’s  own  IP  address  is intended to detect situations
                     where the network is not working as configured, so  drop‐
                     ping the request would frustrate that intent.

               ·      For each SVC_MON_SRC_IP defined in the value of the
                      ip_port_mappings:ENDPOINT_IP column of the
                      Load_Balancer table, a priority-110 logical flow is
                      added with the match arp.tpa == SVC_MON_SRC_IP &&
                      arp.op == 1 and applies the action

                     eth.dst = eth.src;
                     eth.src = E;
                     arp.op = 2; /* ARP reply. */
                     arp.tha = arp.sha;
                     arp.sha = E;
                     arp.tpa = arp.spa;
                     arp.spa = A;
                     outport = inport;
                     flags.loopback = 1;
                     output;


                     where  E is the service monitor source mac defined in the
                     options:svc_monitor_mac column in  the  NB_Global  table.
                     This mac is used as the source mac in the service monitor
                     packets for the load balancer endpoint IP health checks.

                     SVC_MON_SRC_IP is used as the source ip  in  the  service
                     monitor  IPv4  packets  for the load balancer endpoint IP
                     health checks.

                     These flows are required if an ARP request  is  sent  for
                     the IP SVC_MON_SRC_IP.

               ·      For each VIP configured in the table Forwarding_Group
                      a priority-50 logical flow is added with the match
                      arp.tpa == vip && arp.op == 1 and applies the action

                     eth.dst = eth.src;
                     eth.src = E;
                     arp.op = 2; /* ARP reply. */
                     arp.tha = arp.sha;
                     arp.sha = E;
                     arp.tpa = arp.spa;
                     arp.spa = A;
                     outport = inport;
                     flags.loopback = 1;
                     output;


                     where  E  is  the  forwarding  group’s mac defined in the
                     vmac.

                     A is used as either the destination ip for load balancing
                     traffic  to child ports or as nexthop to hosts behind the
                     child ports.

                     These flows are required to respond to an ARP request  if
                     an ARP request is sent for the IP vip.

              ·      One priority-0 fallback flow that matches all packets and
                     advances to the next table.

     Ingress Table 14: DHCP option processing

        This table adds the DHCPv4 options to a DHCPv4 packet from the
        logical ports configured with IPv4 address(es) and DHCPv4 options,
        and similarly for DHCPv6 options. This table also adds flows for
        the logical ports of type external. A configuration sketch follows
        the list below.

              ·      A  priority-100  logical  flow is added for these logical
                     ports which matches the IPv4 packet with udp.src = 68 and
                     udp.dst  =  67  and  applies the action put_dhcp_opts and
                     advances the packet to the next table.

                     reg0[3] = put_dhcp_opts(offer_ip = ip, options...);
                     next;


                     For DHCPDISCOVER and  DHCPREQUEST,  this  transforms  the
                     packet  into  a DHCP reply, adds the DHCP offer IP ip and
                     options to the packet, and stores  1  into  reg0[3].  For
                     other  kinds  of  packets, it just stores 0 into reg0[3].
                     Either way, it continues to the next table.

              ·      A priority-100 logical flow is added  for  these  logical
                     ports  which  matches  the IPv6 packet with udp.src = 546
                     and udp.dst = 547 and applies the action  put_dhcpv6_opts
                     and advances the packet to the next table.

                     reg0[3] = put_dhcpv6_opts(ia_addr = ip, options...);
                     next;


                     For  DHCPv6  Solicit/Request/Confirm packets, this trans‐
                     forms the packet into a DHCPv6 Advertise/Reply, adds  the
                     DHCPv6  offer IP ip and options to the packet, and stores
                     1 into reg0[3]. For  other  kinds  of  packets,  it  just
                     stores  0  into  reg0[3]. Either way, it continues to the
                     next table.

               ·      A priority-0 flow that matches all packets and
                      advances them to table 15.
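
        The DHCPv4 options matched here come from the Northbound
        DHCP_Options table. A minimal sketch with ovn-nbctl, in which all
        names and values are illustrative; looking the row up with find
        is only one way to obtain its UUID:

               ovn-nbctl dhcp-options-create 192.168.0.0/24
               uuid=$(ovn-nbctl --bare --columns=_uuid find dhcp_options \
                   cidr="192.168.0.0/24")
               ovn-nbctl dhcp-options-set-options $uuid \
                   lease_time=3600 router=192.168.0.1 \
                   server_id=192.168.0.1 server_mac=00:00:00:00:00:ff
               ovn-nbctl lsp-set-dhcpv4-options sw0-p1 $uuid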

     Ingress Table 15: DHCP responses

       This  table implements DHCP responder for the DHCP replies generated by
       the previous table.

              ·      A priority 100 logical flow  is  added  for  the  logical
                     ports  configured  with DHCPv4 options which matches IPv4
                      packets with udp.src == 68 && udp.dst == 67 && reg0[3]
                      == 1 and responds back to the inport after applying these
                     actions. If reg0[3] is set to 1, it means that the action
                     put_dhcp_opts was successful.

                     eth.dst = eth.src;
                     eth.src = E;
                     ip4.src = S;
                     udp.src = 67;
                     udp.dst = 68;
                     outport = P;
                     flags.loopback = 1;
                     output;


                     where  E  is  the  server MAC address and S is the server
                      IPv4 address defined in the DHCPv4 options. Note that
                      the ip4.dst field is handled by put_dhcp_opts.

                     (This  terminates  ingress  packet processing; the packet
                     does not go to the next ingress table.)

              ·      A priority 100 logical flow  is  added  for  the  logical
                     ports  configured  with DHCPv6 options which matches IPv6
                      packets with udp.src == 546 && udp.dst == 547 && reg0[3]
                     == 1 and responds back to the inport after applying these
                     actions. If reg0[3] is set to 1, it means that the action
                     put_dhcpv6_opts was successful.

                     eth.dst = eth.src;
                     eth.src = E;
                     ip6.dst = A;
                     ip6.src = S;
                     udp.src = 547;
                     udp.dst = 546;
                     outport = P;
                     flags.loopback = 1;
                     output;


                     where  E  is  the  server MAC address and S is the server
                     IPv6 LLA address generated from the server_id defined  in
                     the  DHCPv6  options and A is the IPv6 address defined in
                     the logical port’s addresses column.

                      (This terminates packet processing; the packet does
                      not go on to the next ingress table.)

               ·      A priority-0 flow that matches all packets and
                      advances them to table 16.

      Ingress Table 16: DNS Lookup

        This table looks up and resolves the DNS names to the corresponding
        configured IP address(es); a configuration sketch follows the list
        below.

              ·      A priority-100 logical flow for each logical switch data‐
                     path if it is configured with DNS records, which  matches
                     the  IPv4  and IPv6 packets with udp.dst = 53 and applies
                     the action dns_lookup and advances the packet to the next
                     table.

                     reg0[4] = dns_lookup(); next;


                     For  valid DNS packets, this transforms the packet into a
                     DNS reply if the DNS name can be resolved, and  stores  1
                     into reg0[4]. For failed DNS resolution or other kinds of
                     packets, it just stores 0 into reg0[4].  Either  way,  it
                     continues to the next table.
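
        The DNS records used here come from the Northbound DNS table,
        referenced from the switch’s dns_records column. A sketch using
        ovn-nbctl’s generic database commands, with illustrative names:

               dns=$(ovn-nbctl create DNS \
                   records='"vm1.example.org"="192.168.0.11"')
               ovn-nbctl add Logical_Switch sw0 dns_records $dns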

      Ingress Table 17: DNS Responses

       This  table  implements  DNS responder for the DNS replies generated by
       the previous table.

              ·      A priority-100 logical flow for each logical switch data‐
                     path  if it is configured with DNS records, which matches
                      the IPv4 and IPv6 packets with udp.dst == 53 &&
                      reg0[4] == 1 and responds back to the inport after
                      applying these
                     actions. If reg0[4] is set to 1, it means that the action
                     dns_lookup was successful.

                      eth.dst <-> eth.src;
                      ip4.src <-> ip4.dst;
                     udp.dst = udp.src;
                     udp.src = 53;
                     outport = P;
                     flags.loopback = 1;
                     output;


                     (This  terminates  ingress  packet processing; the packet
                     does not go to the next ingress table.)

      Ingress Table 18: External ports

        Traffic from the external logical ports enters the ingress datapath
        pipeline via the localnet port. This table adds the logical flows
        below to handle the traffic from these ports.

               ·      A priority-100 flow is added for each external
                      logical port which doesn’t reside on a chassis to
                      drop the ARP/IPv6 NS request to the router IP(s) (of
                      the logical switch), which matches on the inport of
                      the external logical port and the valid eth.src
                      address(es) of the external logical port.

                      This flow guarantees that the ARP/NS request to the
                      router IP address from the external ports is
                      responded to only by the chassis which has claimed
                      these external ports. All other chassis drop these
                      packets.

               ·      A priority-0 flow that matches all packets and
                      advances them to table 19.

      Ingress Table 19: Destination Lookup

        This table implements switching behavior. It contains these logical
        flows (a configuration sketch follows the list):

               ·      A priority-110 flow with the match eth.src == E for
                      all logical switch datapaths which applies the action
                      handle_svc_check(inport), where E is the service
                      monitor mac defined in the options:svc_monitor_mac
                      column of the NB_Global table.

              ·      A priority-100 flow that punts all  IGMP/MLD  packets  to
                     ovn-controller  if  multicast  snooping is enabled on the
                     logical switch. The flow also forwards the IGMP/MLD pack‐
                     ets  to  the  MC_MROUTER_STATIC  multicast  group,  which
                     ovn-northd populates with all the logical ports that have
                      options:mcast_flood_reports=true.

              ·      Priority-90  flows  that  forward registered IP multicast
                     traffic to their  corresponding  multicast  group,  which
                     ovn-northd  creates  based  on learnt IGMP_Group entries.
                      The flows also forward packets to the
                      MC_MROUTER_FLOOD multicast group, which ovn-northd
                      populates with all the logical ports that are
                      connected to logical routers with
                      options:mcast_relay=’true’.

              ·      A priority-85 flow that forwards all IP multicast traffic
                     destined to 224.0.0.X to the  MC_FLOOD  multicast  group,
                     which  ovn-northd  populates  with  all  enabled  logical
                     ports.

              ·      A priority-85 flow that forwards all IP multicast traffic
                     destined  to reserved multicast IPv6 addresses (RFC 4291,
                     2.7.1, e.g., Solicited-Node multicast)  to  the  MC_FLOOD
                     multicast  group,  which  ovn-northd  populates  with all
                     enabled logical ports.

               ·      A priority-80 flow that forwards all unregistered IP
                      multicast traffic to the MC_STATIC multicast group,
                      which ovn-northd populates with all the logical ports
                      that have options:mcast_flood=true. The flow also
                      forwards unregistered IP multicast traffic to the
                      MC_MROUTER_FLOOD multicast group, which ovn-northd
                      populates with all the logical ports connected to
                      logical routers that have options:mcast_relay=true.

               ·      A priority-80 flow that drops all unregistered IP
                      multicast traffic if other_config:mcast_snoop=true
                      and other_config:mcast_flood_unregistered=false and
                      the switch is not connected to a logical router that
                      has options:mcast_relay=true and the switch doesn’t
                      have any logical port with options:mcast_flood=true.

              ·      Priority-80 flows for each port connected  to  a  logical
                     router matching self originated GARP/ARP request/ND pack‐
                      ets. These packets are flooded to the MC_FLOOD
                      multicast group, which contains all logical ports.

              ·      Priority-75  flows  for  each  IP address/VIP/NAT address
                     owned by a router port connected  to  the  switch.  These
                     flows  match ARP requests and ND packets for the specific
                     IP addresses. Matched packets are forwarded only  to  the
                     router that owns the IP address.

              ·      A  priority-70 flow that outputs all packets with an Eth‐
                     ernet broadcast or multicast eth.dst to the MC_FLOOD mul‐
                     ticast group.

              ·      One  priority-50  flow  that  matches each known Ethernet
                     address against eth.dst and outputs  the  packet  to  the
                     single associated output port.

                     For the Ethernet address on a logical switch port of type
                     router, when that logical switch port’s addresses  column
                     is  set  to  router and the connected logical router port
                     specifies a redirect-chassis:

                      ·      The flow for the connected logical router
                             port’s Ethernet address is only programmed on
                             the redirect-chassis.

                     ·      If the logical router has rules specified  in  nat
                            with  external_mac,  then those addresses are also
                            used to populate the switch’s  destination  lookup
                            on the chassis where logical_port is resident.

                     For the Ethernet address on a logical switch port of type
                     router, when that logical switch port’s addresses  column
                     is  set  to  router and the connected logical router port
                      specifies a reside-on-redirect-chassis and the
                      logical router to which the connected logical router
                      port belongs has a redirect-chassis distributed
                      gateway logical router port:

                      ·      The flow for the connected logical router
                             port’s Ethernet address is only programmed on
                             the redirect-chassis.

                      For each forwarding group configured on the logical
                      switch datapath, a priority-50 flow that matches on
                      eth.dst == VIP with an action of
                      fwd_group(childports=args), where args contains comma
                      separated logical switch child ports to load balance
                      to. If liveness is enabled, then the action also
                      includes liveness=true.

              ·      One priority-0 fallback flow that matches all packets and
                     outputs  them  to  the  MC_UNKNOWN multicast group, which
                     ovn-northd populates with all enabled logical ports  that
                     accept  unknown destination packets. As a small optimiza‐
                     tion, if no  logical  ports  accept  unknown  destination
                     packets,  ovn-northd omits this multicast group and logi‐
                     cal flow.
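
       As an illustration of the destination lookup flows above (the
       port names and addresses here are hypothetical), a logical
       switch port lsp1 whose addresses column holds 0a:00:00:00:00:01
       would yield a priority-50 flow matching eth.dst ==
       0a:00:00:00:00:01 with the action

       outport = "lsp1"; output;

       and a forwarding group with virtual MAC 00:00:00:00:00:10,
       liveness enabled, and child ports lsp2 and lsp3 would yield a
       priority-50 flow matching eth.dst == 00:00:00:00:00:10 with the
       action

       fwd_group(liveness=true, childports="lsp2", "lsp3");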

     Egress Table 0: Pre-LB

       This table is similar to ingress table Pre-LB. It contains a priority-0
       flow  that simply moves traffic to the next table. Moreover it contains
       a priority-110 flow to move IPv6 Neighbor Discovery traffic to the next
       table.  If  any  load  balancing rules exist for the datapath, a prior‐
       ity-100 flow is added with a match of ip and action  of  reg0[0]  =  1;
       next; to act as a hint for table Pre-stateful to send IP packets to the
       connection tracker for packet de-fragmentation.

       This table also has a priority-110 flow with the match eth.src
       == E for all logical switch datapaths to move traffic to the
       next table, where E is the service monitor mac defined in the
       options:svc_monitor_mac column of the NB_Global table.

     Egress Table 1: to-lport Pre-ACLs

       This is similar to ingress table Pre-ACLs except for to-lport traffic.

       This table also has a priority-110 flow with the match eth.src
       == E for all logical switch datapaths to move traffic to the
       next table, where E is the service monitor mac defined in the
       options:svc_monitor_mac column of the NB_Global table.

     Egress Table 2: Pre-stateful

       This is similar to ingress table Pre-stateful.

     Egress Table 3: LB

       This is similar to ingress table LB.

     Egress Table 4: to-lport ACLs

       This is similar to ingress table ACLs except for to-lport ACLs.

       In addition, the following flows are added.

               ·      A priority 34000 logical flow is added for each
                      logical port which has DHCPv4 options defined, to
                      allow the DHCPv4 reply packet, and for each
                      logical port which has DHCPv6 options defined, to
                      allow the DHCPv6 reply packet, from the Ingress
                      Table 15: DHCP responses (see the example
                      following this list).

               ·      A priority 34000 logical flow is added for each
                      logical switch datapath configured with DNS
                      records with the match udp.dst == 53 to allow the
                      DNS reply packet from the Ingress Table 17: DNS
                      responses.

               ·      A priority 34000 logical flow is added for each
                      logical switch datapath with the match eth.src ==
                      E to allow the service monitor request packet
                      generated by ovn-controller with the action
                      next;, where E is the service monitor mac defined
                      in the options:svc_monitor_mac column of the
                      NB_Global table.
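
       As a sketch (the port name and server address here are
       hypothetical), for a logical port lsp1 whose DHCPv4 options name
       10.0.0.1 as the server, the corresponding allow flow would match
       roughly

       outport == "lsp1" && ip4.src == 10.0.0.1
           && udp.src == 67 && udp.dst == 68

       with the action next;.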

     Egress Table 5: to-lport QoS Marking

       This is similar to ingress table  QoS  marking  except  they  apply  to
       to-lport QoS rules.

     Egress Table 6: to-lport QoS Meter

       This  is  similar  to  ingress  table  QoS  meter  except they apply to
       to-lport QoS rules.

     Egress Table 7: Stateful

       This is similar to ingress table Stateful  except  that  there  are  no
       rules added for load balancing new connections.

     Egress Table 8: Egress Port Security - IP

       This is similar to the port security logic in table Ingress
       Port Security - IP except that outport, eth.dst, ip4.dst and
       ip6.dst are checked instead of inport, eth.src, ip4.src and
       ip6.src.

     Egress Table 9: Egress Port Security - L2

       This  is  similar  to  the ingress port security logic in ingress table
       Admission Control and Ingress Port Security - L2,  but  with  important
       differences. Most obviously, outport and eth.dst are checked instead of
       inport and eth.src. Second, packets directed to broadcast or  multicast
       eth.dst  are always accepted instead of being subject to the port secu‐
       rity rules; this  is  implemented  through  a  priority-100  flow  that
       matches on eth.mcast with action output;. Moreover, to ensure that even
       broadcast and multicast packets are not delivered to  disabled  logical
       ports,  a priority-150 flow for each disabled logical outport overrides
       the priority-100 flow with a drop; action. Finally, if egress
       QoS has been enabled on a localnet port, the outgoing queue id
       is set through the set_queue action. Please remember to mark
       the corresponding physical interface with ovn-egress-iface set
       to true in external_ids.
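
       To illustrate how these flows interact (the port name lsp1 is
       hypothetical), if lsp1 is disabled, this table would contain
       roughly, in ovn-sbctl lflow-list style,

       priority=150, match=(outport == "lsp1"), action=(drop;)
       priority=100, match=(eth.mcast), action=(output;)

       so that broadcast and multicast traffic is normally delivered
       but never to the disabled port.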

   Logical Router Datapaths
       Logical router datapaths will only exist for Logical_Router
       rows in the OVN_Northbound database that do not have enabled
       set to false.

     Ingress Table 0: L2 Admission Control

       This table drops packets that the router shouldn’t see at all based  on
       their Ethernet headers. It contains the following flows:

              ·      Priority-100 flows to drop packets with VLAN tags or mul‐
                     ticast Ethernet source addresses.

               ·      For each enabled router port P with Ethernet
                      address E, a priority-50 flow that matches inport
                      == P && (eth.mcast || eth.dst == E), with action
                      next; (see the example following this list).

                      For the gateway port on a distributed logical
                      router (where one of the logical router ports
                      specifies a redirect-chassis), the above flow
                      matching eth.dst == E is only programmed on the
                      gateway port instance on the redirect-chassis.

               ·      For each dnat_and_snat NAT rule on a distributed
                      router that specifies an external Ethernet
                      address E, a priority-50 flow that matches inport
                      == GW && eth.dst == E, where GW is the logical
                      router gateway port, with action next;.

                     This flow is only programmed on the gateway port instance
                     on  the  chassis  where the logical_port specified in the
                     NAT rule resides.

       Other packets are implicitly dropped.
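
       As a sketch with hypothetical values, a router port lrp1 with
       Ethernet address 00:00:00:00:ff:01 would yield a priority-50
       flow matching

       inport == "lrp1" && (eth.mcast || eth.dst == 00:00:00:00:ff:01)

       with the action next;.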

     Ingress Table 1: Neighbor lookup

       For ARP and IPv6 Neighbor Discovery packets, this table looks into  the
       MAC_Binding  records  to  determine if OVN needs to learn the mac bind‐
       ings. Following flows are added:

              ·      For each router port P that  owns  IP  address  A,  which
                     belongs  to subnet S with prefix length L, a priority-100
                      flow is added which matches inport == P &&
                      arp.spa == S/L && arp.op == 1 (ARP request) with
                      the following actions:

                     reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
                     next;


                      If the logical router port P is a distributed
                      gateway router port, an additional match
                      is_chassis_resident(cr-P) is added so that the
                      resident gateway chassis handles the neighbor
                      lookup.

              ·      A priority-100 flow which matches on  ARP  reply  packets
                     and applies the actions:

                     reg9[4] = lookup_arp(inport, arp.spa, arp.sha);
                     next;


              ·      A  priority-100  flow which matches on IPv6 Neighbor Dis‐
                     covery advertisement packet and applies the actions:

                     reg9[4] = lookup_nd(inport, nd.target, nd.tll);
                     next;


              ·      A priority-100 flow which matches on IPv6  Neighbor  Dis‐
                     covery solicitation packet and applies the actions:

                     reg9[4] = lookup_nd(inport, ip6.src, nd.sll);
                     next;


              ·      A  priority-0  fallback flow that matches all packets and
                     applies the action  reg9[5]  =  1;  next;  advancing  the
                     packet to the next table.

     Ingress Table 2: Neighbor learning

       This  table  adds flows to learn the mac bindings from the ARP and IPv6
       Neighbor Solicitation/Advertisement packets if ARP/ND lookup failed  in
       the previous table.

       reg9[4] will be 1 if the lookup_arp/lookup_nd in the previous table was
       successful.

       reg9[5] will be 1 if there was no need to do the lookup.

               ·      A priority-100 flow with the match reg9[4] == 1
                      || reg9[5] == 1 that advances the packet to the
                      next table, as there is no need to learn the
                      neighbor.

               ·      A priority-90 flow with the match arp that
                      applies the action put_arp(inport, arp.spa,
                      arp.sha); next;

               ·      A priority-90 flow with the match nd_na that
                      applies the action put_nd(inport, nd.target,
                      nd.tll); next;

               ·      A priority-90 flow with the match nd_ns that
                      applies the action put_nd(inport, ip6.src,
                      nd.sll); next;

     Ingress Table 3: IP Input

       This table is the core of the logical router datapath functionality. It
       contains the following flows to implement very basic IP host  function‐
       ality.

               ·      For each NAT entry of a distributed logical
                      router (with distributed gateway router port) of
                      type snat, a priority-120 flow with the match
                      inport == P && ip4.src == A advances the packet
                      to the next pipeline, where P is the distributed
                      logical router port and A is the external_ip set
                      in the NAT entry. If A is an IPv6 address, then
                      ip6.src is used for the match.

                      The above flow is required to handle the routing
                      of the east/west NAT traffic.

              ·      L3 admission control: A priority-100 flow  drops  packets
                     that match any of the following:

                     ·      ip4.src[28..31] == 0xe (multicast source)

                     ·      ip4.src == 255.255.255.255 (broadcast source)

                     ·      ip4.src  ==  127.0.0.0/8 || ip4.dst == 127.0.0.0/8
                            (localhost source or destination)

                     ·      ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8 (zero
                            network source or destination)

                      ·      ip4.src or ip6.src is any IP address owned
                             by the router, unless the packet was
                             recirculated due to egress loopback as
                             indicated by REGBIT_EGRESS_LOOPBACK.

                     ·      ip4.src is the broadcast address of any IP network
                            known to the router.

              ·      A priority-87 flow explicitly allows IPv6 multicast traf‐
                     fic that is supposed to reach the router pipeline  (e.g.,
                     neighbor  solicitations  and traffic destined to the All-
                     Routers multicast group).

              ·      A  priority-86  flow  allows  IP  multicast  traffic   if
                     options:mcast_relay=’true’, otherwise drops it.

              ·      ICMP  echo reply. These flows reply to ICMP echo requests
                     received for the router’s IP address.  Let  A  be  an  IP
                     address  owned by a router port. Then, for each A that is
                     an IPv4 address, a priority-90 flow matches on ip4.dst ==
                      A and icmp4.type == 8 && icmp4.code == 0 (ICMP echo
                     request). For each A that is an IPv6  address,  a  prior‐
                     ity-90 flow matches on ip6.dst == A and icmp6.type == 128
                      && icmp6.code == 0 (ICMPv6 echo request). The port of the
                     router  that  receives  the echo request does not matter.
                     Also, the ip.ttl  of  the  echo  request  packet  is  not
                     checked,  so  it complies with RFC 1812, section 4.2.2.9.
                     Flows for ICMPv4 echo requests use the following actions:

                      ip4.dst <-> ip4.src;
                     ip.ttl = 255;
                     icmp4.type = 0;
                     flags.loopback = 1;
                     next;


                     Flows for ICMPv6 echo requests use the following actions:

                      ip6.dst <-> ip6.src;
                     ip.ttl = 255;
                     icmp6.type = 129;
                     flags.loopback = 1;
                     next;


              ·      Reply to ARP requests.

                      These flows reply to ARP requests for the
                      router’s own IP address. The ARP requests are
                      handled only if the requestor’s IP belongs to one
                      of the subnets of the logical router port. For
                      each router port P that owns IP address A, which
                      belongs to subnet S with prefix length L, and
                      Ethernet address E, a priority-90 flow matches
                      inport == P && arp.spa == S/L && arp.op == 1 &&
                      arp.tpa == A (ARP request) with the following
                      actions:

                     eth.dst = eth.src;
                     eth.src = E;
                     arp.op = 2; /* ARP reply. */
                     arp.tha = arp.sha;
                     arp.sha = E;
                     arp.tpa = arp.spa;
                     arp.spa = A;
                     outport = P;
                     flags.loopback = 1;
                     output;


                      For the gateway port on a distributed logical
                      router (where one of the logical router ports
                      specifies a redirect-chassis), the above flows
                      are only programmed on the gateway port instance
                      on the redirect-chassis. This behavior avoids
                      generation of multiple ARP responses from
                      different chassis, and allows upstream MAC
                      learning to point to the redirect-chassis.

                      For the logical router port with the option
                      reside-on-redirect-chassis set (which is
                      centralized), the above flows are only programmed
                      on the gateway port instance on the
                      redirect-chassis (if the logical router has a
                      distributed gateway port). This behavior avoids
                      generation of multiple ARP responses from
                      different chassis, and allows upstream MAC
                      learning to point to the redirect-chassis.

              ·      Reply to IPv6 Neighbor Solicitations. These  flows  reply
                     to  Neighbor  Solicitation  requests for the router’s own
                     IPv6 address and populate the logical router’s mac  bind‐
                     ing table.

                      For each router port P that owns IPv6 address A,
                      solicited node address S, and Ethernet address E,
                      a priority-90 flow matches inport == P && nd_ns
                      && ip6.dst == {A, S} && nd.target == A with the
                      following actions:

                     nd_na_router {
                         eth.src = E;
                         ip6.src = A;
                         nd.target = A;
                         nd.tll = E;
                         outport = inport;
                         flags.loopback = 1;
                         output;
                     };


                      For the gateway port on a distributed logical
                      router (where one of the logical router ports
                      specifies a redirect-chassis), the above flows
                      replying to IPv6 Neighbor Solicitations are only
                      programmed on the gateway port instance on the
                      redirect-chassis. This behavior avoids generation
                      of multiple replies from different chassis, and
                      allows upstream MAC learning to point to the
                      redirect-chassis.

               ·      These flows reply to ARP requests or IPv6
                      neighbor solicitations for the virtual IP
                      addresses configured in the router for DNAT or
                      load balancing.

                      IPv4: For a configured DNAT IP address or a load
                      balancer IPv4 VIP A, for each router port P with
                      Ethernet address E, a priority-90 flow matches
                      inport == P && arp.op == 1 && arp.tpa == A (ARP
                      request) with the following actions:

                     eth.dst = eth.src;
                     eth.src = E;
                     arp.op = 2; /* ARP reply. */
                     arp.tha = arp.sha;
                     arp.sha = E;
                     arp.tpa = arp.spa;
                     arp.spa = A;
                     outport = P;
                     flags.loopback = 1;
                     output;


                      If the router port P is a distributed gateway
                      router port, then is_chassis_resident(P) is also
                      added to the match condition for the load
                      balancer IPv4 VIP A.

                      IPv6: For a configured DNAT IP address or a load
                      balancer IPv6 VIP A, solicited node address S,
                      for each router port P with Ethernet address E, a
                      priority-90 flow matches inport == P && nd_ns &&
                      ip6.dst == {A, S} && nd.target == A with the
                      following actions:

                     eth.dst = eth.src;
                     nd_na {
                         eth.src = E;
                         nd.tll = E;
                         ip6.src = A;
                         nd.target = A;
                         outport = P;
                         flags.loopback = 1;
                         output;
                     }


                      If the router port P is a distributed gateway
                      router port, then is_chassis_resident(P) is also
                      added to the match condition for the load
                      balancer IPv6 VIP A.

                     For the gateway port on a distributed logical router with
                     NAT  (where  one  of the logical router ports specifies a
                     redirect-chassis):

                      ·      If the corresponding NAT rule cannot be
                             handled in a distributed manner, then this
                             flow is only programmed on the gateway
                             port instance on the redirect-chassis.
                             This behavior avoids generation of
                             multiple ARP responses from different
                             chassis, and allows upstream MAC learning
                             to point to the redirect-chassis.

                     ·      If the corresponding NAT rule can be handled in  a
                            distributed  manner,  then  this flow is only pro‐
                            grammed on the gateway  port  instance  where  the
                            logical_port specified in the NAT rule resides.

                            Some  of  the actions are different for this case,
                            using the external_mac specified in the  NAT  rule
                            rather than the gateway port’s Ethernet address E:

                            eth.src = external_mac;
                            arp.sha = external_mac;


                             or in the case of IPv6 neighbor solicitation:

                            eth.src = external_mac;
                            nd.tll = external_mac;


                            This  behavior  avoids  generation of multiple ARP
                            responses  from  different  chassis,  and   allows
                            upstream  MAC  learning  to  point  to the correct
                            chassis.

               ·      Priority-85 flows which drop the ARP and IPv6
                      Neighbor Discovery packets.

              ·      UDP  port  unreachable.  Priority-80  flows generate ICMP
                     port unreachable  messages  in  reply  to  UDP  datagrams
                     directed  to  the router’s IP address, except in the spe‐
                     cial case of gateways, which accept traffic directed to a
                     router IP for load balancing and NAT purposes.

                     These  flows  should  not match IP fragments with nonzero
                     offset.

              ·      TCP reset. Priority-80 flows generate TCP reset  messages
                     in  reply  to  TCP  datagrams directed to the router’s IP
                     address, except in the special case  of  gateways,  which
                     accept traffic directed to a router IP for load balancing
                     and NAT purposes.

                     These flows should not match IP  fragments  with  nonzero
                     offset.

              ·      Protocol or address unreachable. Priority-70 flows gener‐
                     ate ICMP protocol or  address  unreachable  messages  for
                     IPv4  and  IPv6 respectively in reply to packets directed
                     to the router’s IP address on  IP  protocols  other  than
                     UDP,  TCP,  and ICMP, except in the special case of gate‐
                     ways, which accept traffic directed to a  router  IP  for
                     load balancing purposes.

                     These  flows  should  not match IP fragments with nonzero
                     offset.

              ·      Drop other IP traffic to this router.  These  flows  drop
                     any  other  traffic  destined  to  an  IP address of this
                     router that is not already handled by one  of  the  flows
                     above,  which  amounts to ICMP (other than echo requests)
                     and fragments with nonzero offsets. For each IP address A
                     owned  by  the router, a priority-60 flow matches ip4.dst
                     == A or ip6.dst == A and drops the traffic. An  exception
                     is  made  and  the  above flow is not added if the router
                     port’s own IP address is used  to  SNAT  packets  passing
                     through that router.

       The flows above handle all of the traffic that might be directed to the
       router itself. The following flows (with lower priorities)  handle  the
       remaining traffic, potentially for forwarding:

              ·      Drop  Ethernet  local  broadcast. A priority-50 flow with
                     match eth.bcast drops traffic destined to the local  Eth‐
                     ernet  broadcast  address.  By  definition  this  traffic
                     should not be forwarded.

              ·      ICMP time exceeded. For each  router  port  P,  whose  IP
                     address  is  A, a priority-40 flow with match inport == P
                      && ip.ttl == {0, 1} && !ip.later_frag matches packets
                     whose TTL has expired, with the following actions to send
                     an ICMP time exceeded reply for  IPv4  and  IPv6  respec‐
                     tively:

                     icmp4 {
                         icmp4.type = 11; /* Time exceeded. */
                         icmp4.code = 0;  /* TTL exceeded in transit. */
                         ip4.dst = ip4.src;
                         ip4.src = A;
                         ip.ttl = 255;
                         next;
                     };
                     icmp6 {
                         icmp6.type = 3; /* Time exceeded. */
                         icmp6.code = 0;  /* TTL exceeded in transit. */
                         ip6.dst = ip6.src;
                         ip6.src = A;
                         ip.ttl = 255;
                         next;
                     };


               ·      TTL discard. A priority-30 flow with match ip.ttl
                      == {0, 1} and actions drop; drops other packets
                      whose TTL has expired and that should not receive
                      an ICMP error reply (i.e. fragments with nonzero
                      offset).

               ·      Next table. A priority-0 flow matches all packets
                      that aren’t already handled and uses actions
                      next; to feed them to the next table.

     Ingress Table 4: DEFRAG

       This is to send packets to the connection tracker for tracking
       and defragmentation. It contains a priority-0 flow that simply
       moves traffic to the next table. If load balancing rules with
       virtual IP addresses (and ports) are configured in the
       OVN_Northbound database for a Gateway router, a priority-100
       flow is added for each configured virtual IP address VIP. For
       IPv4 VIPs the flow matches ip && ip4.dst == VIP. For IPv6 VIPs,
       the flow matches ip && ip6.dst == VIP. The flow uses the action
       ct_next; to send IP packets to the connection tracker for
       packet de-fragmentation and tracking before sending them to the
       next table.
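
       For example, with a hypothetical load balancer VIP 30.0.0.1
       configured on a Gateway router, this table would contain a
       priority-100 flow matching ip && ip4.dst == 30.0.0.1 with the
       action

       ct_next;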

     Ingress Table 5: UNSNAT

       This is for already established connections’ reverse traffic,
       i.e., SNAT has already been done in the egress pipeline and now
       the packet has entered the ingress pipeline as part of a reply.
       It is unSNATted here.

       Ingress Table 5: UNSNAT on Gateway Routers

               ·      If the Gateway router has been configured to
                      force SNAT any previously DNATted packets to B, a
                      priority-110 flow matches ip && ip4.dst == B or
                      ip && ip6.dst == B with an action ct_snat;.

                      If the Gateway router has been configured to
                      force SNAT any previously load-balanced packets
                      to B, a priority-100 flow matches ip && ip4.dst
                      == B or ip && ip6.dst == B with an action
                      ct_snat;.

                      For each NAT configuration in the OVN Northbound
                      database that asks to change the source IP
                      address of a packet from A to B, a priority-90
                      flow matches ip && ip4.dst == B or ip && ip6.dst
                      == B with an action ct_snat; (see the example
                      following this list). If the NAT rule is of type
                      dnat_and_snat and has stateless=true in the
                      options, then the action would be ip4/6.dst=(A).

                     A priority-0 logical flow with match 1 has actions next;.
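
       As a sketch with hypothetical addresses, a NAT entry that SNATs
       source 10.0.0.4 (A) to 172.16.0.10 (B) yields a priority-90
       flow matching ip && ip4.dst == 172.16.0.10 with the action

       ct_snat;

       or, for a stateless dnat_and_snat rule, the action

       ip4.dst=(10.0.0.4);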

       Ingress Table 5: UNSNAT on Distributed Routers

               ·      For each configuration in the OVN Northbound
                      database that asks to change the source IP
                      address of a packet from A to B, a priority-100
                      flow matches ip && ip4.dst == B && inport == GW
                      or ip && ip6.dst == B && inport == GW, where GW
                      is the logical router gateway port, with an
                      action ct_snat;. If the NAT rule is of type
                      dnat_and_snat and has stateless=true in the
                      options, then the action would be ip4/6.dst=(A).

                     If  the  NAT rule cannot be handled in a distributed man‐
                     ner, then the priority-100 flow above is only  programmed
                     on the redirect-chassis.

                     A priority-0 logical flow with match 1 has actions next;.

     Ingress Table 6: DNAT

       Packets enter the pipeline with a destination IP address that
       needs to be DNATted from a virtual IP address to a real IP
       address. Packets in the reverse direction need to be unDNATted.

       Ingress Table 6: Load balancing DNAT rules

       The following load balancing DNAT flows are added for a Gateway
       router or Router with gateway port. These flows are programmed
       only on the redirect-chassis. These flows do not get programmed
       for load balancers with IPv6 VIPs.

               ·      If controller_event has been enabled for the
                      configured load balancing rules for a Gateway
                      router or Router with gateway port in the
                      OVN_Northbound database that do not have
                      configured backends, a priority-130 flow is added
                      to trigger ovn-controller events whenever the
                      chassis receives a packet for that particular
                      VIP. If the event-elb meter has been previously
                      created, it will be associated to the empty_lb
                      logical flow.

               ·      For all the configured load balancing rules for a
                      Gateway router or Router with gateway port in the
                      OVN_Northbound database that include a L4 port
                      PORT of protocol P and IPv4 or IPv6 address VIP,
                      a priority-120 flow that matches on ct.new && ip
                      && ip4.dst == VIP && P && P.dst == PORT (ip6.dst
                      == VIP in the IPv6 case) with an action of
                      ct_lb(args), where args contains comma-separated
                      IPv4 or IPv6 addresses (and optional port
                      numbers) to load balance to (see the example
                      following this list). If the router is configured
                      to force SNAT any load-balanced packets, the
                      above action will be replaced by
                      flags.force_snat_for_lb = 1; ct_lb(args);. If
                      health check is enabled, then args will only
                      contain those endpoints whose service monitor
                      status entry in the OVN_Southbound db is either
                      online or empty.

               ·      For all the configured load balancing rules for a
                      router in the OVN_Northbound database that
                      include a L4 port PORT of protocol P and IPv4 or
                      IPv6 address VIP, a priority-120 flow that
                      matches on ct.est && ip && ip4.dst == VIP && P &&
                      P.dst == PORT (ip6.dst == VIP in the IPv6 case)
                      with an action of ct_dnat;. If the router is
                      configured to force SNAT any load-balanced
                      packets, the above action will be replaced by
                      flags.force_snat_for_lb = 1; ct_dnat;.

               ·      For all the configured load balancing rules for a
                      router in the OVN_Northbound database that
                      include just an IP address VIP to match on, a
                      priority-110 flow that matches on ct.new && ip &&
                      ip4.dst == VIP (ip6.dst == VIP in the IPv6 case)
                      with an action of ct_lb(args), where args
                      contains comma-separated IPv4 or IPv6 addresses.
                      If the router is configured to force SNAT any
                      load-balanced packets, the above action will be
                      replaced by flags.force_snat_for_lb = 1;
                      ct_lb(args);.

               ·      For all the configured load balancing rules for a
                      router in the OVN_Northbound database that
                      include just an IP address VIP to match on, a
                      priority-110 flow that matches on ct.est && ip &&
                      ip4.dst == VIP (or ip6.dst == VIP) with an action
                      of ct_dnat;. If the router is configured to force
                      SNAT any load-balanced packets, the above action
                      will be replaced by flags.force_snat_for_lb = 1;
                      ct_dnat;.
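
       As a sketch with hypothetical values, a load balancer VIP
       30.0.0.1:80 over TCP with backends 192.168.0.2:8080 and
       192.168.0.3:8080 yields a priority-120 flow matching

       ct.new && ip && ip4.dst == 30.0.0.1 && tcp && tcp.dst == 80

       with the action

       ct_lb(192.168.0.2:8080,192.168.0.3:8080);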

       Ingress Table 6: DNAT on Gateway Routers

               ·      For each configuration in the OVN Northbound
                      database that asks to change the destination IP
                      address of a packet from A to B, a priority-100
                      flow matches ip && ip4.dst == A or ip && ip6.dst
                      == A with an action flags.loopback = 1;
                      ct_dnat(B); (see the example following this
                      list). If the Gateway router is configured to
                      force SNAT any DNATed packet, the above action
                      will be replaced by flags.force_snat_for_dnat =
                      1; flags.loopback = 1; ct_dnat(B);. If the NAT
                      rule is of type dnat_and_snat and has
                      stateless=true in the options, then the action
                      would be ip4/6.dst=(B).

              ·      For  all  IP  packets  of a Gateway router, a priority-50
                     flow with an action flags.loopback = 1; ct_dnat;.

              ·      A priority-0 logical flow with match 1 has actions next;.
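
       As a sketch with hypothetical addresses, a dnat rule that maps
       external IP 172.16.0.10 (A) to logical IP 10.0.0.4 (B) yields a
       priority-100 flow matching ip && ip4.dst == 172.16.0.10 with
       the action

       flags.loopback = 1; ct_dnat(10.0.0.4);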

       Ingress Table 6: DNAT on Distributed Routers

       On distributed routers, the DNAT table only handles packets
       with a destination IP address that needs to be DNATted from a
       virtual IP address to a real IP address. The unDNAT processing
       in the reverse direction is handled in a separate table in the
       egress pipeline.

               ·      For each configuration in the OVN Northbound
                      database that asks to change the destination IP
                      address of a packet from A to B, a priority-100
                      flow matches ip && ip4.dst == A && inport == GW,
                      where GW is the logical router gateway port, with
                      an action ct_dnat(B);. The match will include
                      ip6.dst == A in the IPv6 case. If the NAT rule is
                      of type dnat_and_snat and has stateless=true in
                      the options, then the action would be
                      ip4/6.dst=(B).

                     If the NAT rule cannot be handled in a  distributed  man‐
                     ner,  then the priority-100 flow above is only programmed
                     on the redirect-chassis.

                     A priority-0 logical flow with match 1 has actions next;.

     Ingress Table 7: IPv6 ND RA option processing

              ·      A priority-50 logical flow  is  added  for  each  logical
                     router  port  configured  with  IPv6  ND RA options which
                     matches IPv6 ND Router Solicitation  packet  and  applies
                     the  action put_nd_ra_opts and advances the packet to the
                     next table.

                      reg0[5] = put_nd_ra_opts(options); next;


                     For a valid IPv6 ND RS packet, this transforms the packet
                     into  an  IPv6 ND RA reply and sets the RA options to the
                     packet and stores 1 into  reg0[5].  For  other  kinds  of
                     packets,  it  just  stores 0 into reg0[5]. Either way, it
                     continues to the next table.

              ·      A priority-0 logical flow with match 1 has actions next;.

     Ingress Table 8: IPv6 ND RA responder

       This table implements IPv6 ND RA responder for the IPv6 ND  RA  replies
       generated by the previous table.

              ·      A  priority-50  logical  flow  is  added for each logical
                     router port configured with  IPv6  ND  RA  options  which
                     matches  IPv6 ND RA packets and reg0[5] == 1 and responds
                     back to the  inport  after  applying  these  actions.  If
                     reg0[5]   is   set   to  1,  it  means  that  the  action
                     put_nd_ra_opts was successful.

                     eth.dst = eth.src;
                     eth.src = E;
                     ip6.dst = ip6.src;
                     ip6.src = I;
                     outport = P;
                     flags.loopback = 1;
                     output;


                     where E is the MAC address and I is the IPv6  link  local
                     address of the logical router port.

                     (This  terminates  packet processing in ingress pipeline;
                     the packet does not go to the next ingress table.)

              ·      A priority-0 logical flow with match 1 has actions next;.

     Ingress Table 9: IP Routing

       A packet that arrives at this table is an  IP  packet  that  should  be
       routed  to  the address in ip4.dst or ip6.dst. This table implements IP
       routing, setting reg0 (or xxreg0 for IPv6) to the next-hop  IP  address
       (leaving ip4.dst or ip6.dst, the packet’s final destination, unchanged)
       and advances to the next table for ARP resolution. It  also  sets  reg1
       (or  xxreg1)  to  the  IP  address  owned  by  the selected router port
       (ingress table ARP Request will generate an  ARP  request,  if  needed,
       with  reg0 as the target protocol address and reg1 as the source proto‐
       col address).

       For ECMP routes, i.e. multiple static routes with the same
       policy and prefix but different nexthops, the above actions are
       deferred to the next table. This table, instead, is responsible
       for determining the ECMP group id and selecting a member id
       within the group based on 5-tuple hashing. It stores the group
       id in reg8[0..15] and the member id in reg8[16..31].

       This table contains the following logical flows:

              ·      Priority-500 flows that match IP multicast  traffic  des‐
                     tined  to  groups  registered  on  any  of  the  attached
                     switches and sets outport  to  the  associated  multicast
                     group  that  will  eventually  flood  the  traffic to all
                     interested attached  logical  switches.  The  flows  also
                     decrement TTL.

              ·      Priority-450  flow that matches unregistered IP multicast
                     traffic and  sets  outport  to  the  MC_STATIC  multicast
                      group, which ovn-northd populates with the
                      logical ports that have options:mcast_flood=true.

               ·      For distributed logical routers where one of the
                      logical router ports specifies a
                      redirect-chassis, a priority-400 logical flow for
                      each dnat_and_snat NAT rule configured. These
                      flows allow OVN to properly forward traffic to
                      the external connections if available and avoid
                      sending it through the tunnel. Assuming the
                      following NAT rule has been configured:

                     external_ip = A;
                     external_mac = B;
                     logical_ip = C;


                     the following action will be applied:

                     ip.ttl--;
                     reg0 = ip.dst;
                     reg1 = A;
                     eth.src = B;
                     outport = router-port;
                     next;


              ·      IPv4 routing table. For each route to IPv4 network N with
                     netmask  M, on router port P with IP address A and Ether‐
                     net address E, a logical flow with match ip4.dst ==  N/M,
                     whose  priority is 400 + the number of 1-bits in M if the
                     router port is not a distributed gateway port,  else  the
                     priority  is the number of 1-bits in M, has the following
                     actions:

                     ip.ttl--;
                     reg8[0..15] = 0;
                     reg0 = G;
                     reg1 = A;
                     eth.src = E;
                     outport = P;
                     flags.loopback = 1;
                     next;


                      (The ingress table IP Input already verified that
                      ip.ttl--; will not yield a TTL exceeded error.)

                      If the route has a gateway, G is the gateway IP
                      address. If the route is from a configured static
                      route, G is the next hop IP address. Otherwise it
                      is ip4.dst.

              ·      IPv6 routing table. For each route to IPv6 network N with
                     netmask M, on router port P with IP address A and  Ether‐
                     net address E, a logical flow with match in CIDR notation
                     ip6.dst == N/M, whose priority is the integer value of M,
                     has the following actions:

                     ip.ttl--;
                     reg8[0..15] = 0;
                     xxreg0 = G;
                     xxreg1 = A;
                     eth.src = E;
                     outport = P;
                     flags.loopback = 1;
                     next;


                      (The ingress table IP Input already verified that
                      ip.ttl--; will not yield a TTL exceeded error.)

                      If the route has a gateway, G is the gateway IP
                      address. If the route is from a configured static
                      route, G is the next hop IP address. Otherwise it
                      is ip6.dst.

                     If the address A is in the link-local  scope,  the  route
                     will be limited to sending on the ingress port.

               ·      ECMP routes are grouped by policy and prefix. A
                      unique non-zero id is assigned to each group, and
                      each member is also assigned a unique non-zero id
                      within its group.

                     For each IPv4/IPv6 ECMP group with group id GID and  mem‐
                     ber  ids  MID1,  MID2,  ..., a logical flow with match in
                     CIDR notation ip4.dst == N/M, or ip6.dst  ==  N/M,  whose
                     priority  is  the  integer  value of M, has the following
                     actions:

                     ip.ttl--;
                     flags.loopback = 1;
                     reg8[0..15] = GID;
                     select(reg8[16..31], MID1, MID2, ...);


     Ingress Table 10: IP_ROUTING_ECMP

       This table implements the second part of IP routing for ECMP
       routes following the previous table. If a packet matched an
       ECMP group in the previous table, this table matches the group
       id and member id stored
       from the previous table, setting reg0 (or xxreg0 for IPv6) to the next-
       hop IP address (leaving ip4.dst or ip6.dst, the packet’s final destina‐
       tion,  unchanged) and advances to the next table for ARP resolution. It
       also sets reg1 (or xxreg1) to the IP  address  owned  by  the  selected
       router port (ingress table ARP Request will generate an ARP request, if
       needed, with reg0 as the target protocol address and reg1 as the source
       protocol address).

       This table contains the following logical flows:

               ·      A priority-150 flow that matches reg8[0..15] == 0
                      with action next;, directly bypassing packets of
                      non-ECMP routes.

               ·      For each member with ID MID in each ECMP group
                      with ID GID, a priority-100 flow with match
                      reg8[0..15] == GID && reg8[16..31] == MID has the
                      following actions:

                     [xx]reg0 = G;
                     [xx]reg1 = A;
                     eth.src = E;
                     outport = P;


     Ingress Table 12: ARP/ND Resolution

       Any  packet that reaches this table is an IP packet whose next-hop IPv4
       address is in reg0 or IPv6 address is in xxreg0.  (ip4.dst  or  ip6.dst
       contains  the final destination.) This table resolves the IP address in
       reg0 (or xxreg0) into an output port in outport and an Ethernet address
       in eth.dst, using the following flows:

              ·      A  priority-500  flow  that  matches IP multicast traffic
                     that was allowed in the routing pipeline. For  this  kind
                     of  traffic  the outport was already set so the flow just
                     advances to the next table.

               ·      Static MAC bindings. MAC bindings can be known
                      statically based on data in the OVN_Northbound
                      database. For router ports connected to logical
                      switches, MAC bindings can be known statically
                      from the addresses column in the
                      Logical_Switch_Port table. For router ports
                      connected to other logical routers, MAC bindings
                      can be known statically from the mac and networks
                      columns in the Logical_Router_Port table.

                      For each IPv4 address A whose host is known to
                      have Ethernet address E on router port P, a
                      priority-100 flow with match outport == P && reg0
                      == A has actions eth.dst = E; next;.

                      For each virtual ip A configured on a logical
                      port of type virtual, with its virtual parent set
                      in its corresponding Port_Binding record, the
                      virtual parent having Ethernet address E, and the
                      virtual ip reachable via the router port P, a
                      priority-100 flow with match outport == P && reg0
                      == A has actions eth.dst = E; next;.

                      For each virtual ip A configured on a logical
                      port of type virtual, with its virtual parent not
                      set in its corresponding Port_Binding record, and
                      the virtual ip A reachable via the router port P,
                      a priority-100 flow with match outport == P &&
                      reg0 == A has actions eth.dst =
                      00:00:00:00:00:00; next;. This flow is added so
                      that the ARP is always resolved for the virtual
                      ip A by generating an ARP request and not
                      consulting the MAC_Binding table, as it can have
                      an incorrect value for the virtual ip A.

                      For each IPv6 address A whose host is known to
                      have Ethernet address E on router port P, a
                      priority-100 flow with match outport == P &&
                      xxreg0 == A has actions eth.dst = E; next;.

                      For each logical router port with an IPv4 address
                      A and a mac address of E that is reachable via a
                      different logical router port P, a priority-100
                      flow with match outport == P && reg0 == A has
                      actions eth.dst = E; next;.

                      For each logical router port with an IPv6 address
                      A and a mac address of E that is reachable via a
                      different logical router port P, a priority-100
                      flow with match outport == P && xxreg0 == A has
                      actions eth.dst = E; next;.

               ·      Static MAC bindings from NAT entries. MAC
                      bindings can also be known for the entries in the
                      NAT table. Below flows are programmed for
                      distributed logical routers, i.e. with a
                      distributed router port.

                      For each row in the NAT table with IPv4 address A
                      in the external_ip column of the NAT table, a
                      priority-100 flow with the match outport == P &&
                      reg0 == A has actions eth.dst = E; next;, where P
                      is the distributed logical router port, E is the
                      Ethernet address if set in the external_mac
                      column of the NAT table for NAT rules of type
                      dnat_and_snat, otherwise the Ethernet address of
                      the distributed logical router port.

                      For IPv6 NAT entries, the same flows are added,
                      but using the register xxreg0 for the match.

              ·      Dynamic MAC bindings. These flows resolve MAC-to-IP bind‐
                     ings  that  have  become known dynamically through ARP or
                     neighbor discovery. (The ingress table ARP  Request  will
                     issue  an  ARP or neighbor solicitation request for cases
                     where the binding is not yet known.)

                     A priority-0 logical flow  with  match  ip4  has  actions
                     get_arp(outport, reg0); next;.

                     A  priority-0  logical  flow  with  match ip6 has actions
                     get_nd(outport, xxreg0); next;.

               ·      For a logical router port with redirect-chassis
                      and redirect-type set as bridged, a priority-50
                      flow will match outport == "ROUTER_PORT" and
                      !is_chassis_resident ("cr-ROUTER_PORT") and has
                      actions eth.dst = E; next;, where E is the
                      ethernet address of the logical router port.

     Ingress Table 13: Check packet length

       For distributed logical routers with distributed gateway  port  config‐
       ured with options:gateway_mtu to a valid integer value, this table adds
        a priority-50 logical flow with the match ip4 && outport == GW_PORT
       where  GW_PORT  is  the distributed gateway router port and applies the
       action check_pkt_larger and advances the packet to the next table.

       REGBIT_PKT_LARGER = check_pkt_larger(L); next;


        where L is the packet length to check for. If the packet is
        larger than L, it stores 1 in the register bit
        REGBIT_PKT_LARGER. The value of L is taken from the
        options:gateway_mtu column of the Logical_Router_Port row.

       This table adds one priority-0 fallback flow that matches  all  packets
       and advances to the next table.

     Ingress Table 14: Handle larger packets

        For distributed logical routers with distributed gateway port
        configured with options:gateway_mtu to a valid integer value,
        this table adds the following priority-50 logical flow for
        each logical router port with the match ip4 && inport == LRP
        && outport == GW_PORT && REGBIT_PKT_LARGER, where LRP is the
        logical router port and GW_PORT is the distributed gateway
        router port, and applies the following action:

       icmp4 {
           icmp4.type = 3; /* Destination Unreachable. */
           icmp4.code = 4;  /* Frag Needed and DF was Set. */
           icmp4.frag_mtu = M;
           eth.dst = E;
           ip4.dst = ip4.src;
           ip4.src = I;
           ip.ttl = 255;
           REGBIT_EGRESS_LOOPBACK = 1;
           next(pipeline=ingress, table=0);
       };


               ·      Where M is the (fragment MTU - 58), whose value
                      is taken from the options:gateway_mtu column of
                      the Logical_Router_Port row.

              ·      E is the Ethernet address of the logical router port.

              ·      I is the IPv4 address of the logical router port.

       This table adds one priority-0 fallback flow that matches  all  packets
       and advances to the next table.

     Ingress Table 15: Gateway Redirect

       For  distributed  logical routers where one of the logical router ports
       specifies a redirect-chassis, this table redirects certain  packets  to
       the distributed gateway port instance on the redirect-chassis. This ta‐
       ble has the following flows:

               ·      A priority-150 logical flow with match outport ==
                      GW && eth.dst == 00:00:00:00:00:00 has actions
                      outport = CR; next;, where GW is the logical
                      router distributed gateway port and CR is the
                      chassisredirect port representing the instance of
                      the logical router distributed gateway port on
                      the redirect-chassis.

               ·      For each NAT rule in the OVN Northbound database
                      that can be handled in a distributed manner, a
                      priority-200 logical flow with match ip4.src == B
                      && outport == GW, where GW is the logical router
                      distributed gateway port, with actions next;.

               ·      A priority-50 logical flow with match outport ==
                      GW has actions outport = CR; next;, where GW is
                      the logical router distributed gateway port and
                      CR is the chassisredirect port representing the
                      instance of the logical router distributed
                      gateway port on the redirect-chassis (see the
                      example following this list).

              ·      A priority-0 logical flow with match 1 has actions next;.
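
       For example, with a hypothetical distributed gateway port
       lr0-public, the priority-50 flow would match outport ==
       "lr0-public" and apply

       outport = "cr-lr0-public"; next;

       sending the packet to the chassisredirect port and thus to the
       gateway port instance on the redirect-chassis.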

     Ingress Table 16: ARP Request

       In  the  common  case where the Ethernet destination has been resolved,
       this table outputs the packet. Otherwise, it composes and sends an  ARP
       or IPv6 Neighbor Solicitation request. It holds the following flows:

              ·      Unknown MAC address. A priority-100 flow for IPv4 packets
                     with match eth.dst == 00:00:00:00:00:00 has the following
                     actions:

                     arp {
                         eth.dst = ff:ff:ff:ff:ff:ff;
                         arp.spa = reg1;
                         arp.tpa = reg0;
                         arp.op = 1;  /* ARP request. */
                         output;
                     };


                      Unknown MAC address. For each IPv6 static route
                      associated with the router, with nexthop IP G, a
                      priority-200 flow for IPv6 packets with match
                      eth.dst == 00:00:00:00:00:00 && xxreg0 == G with
                      the following actions is added:

                     nd_ns {
                         eth.dst = E;
                          ip6.dst = I;
                         nd.target = G;
                         output;
                     };


                     Where E is the multicast mac derived from the Gateway IP,
                     I is the solicited-node multicast  address  corresponding
                     to the target address G.

                     Unknown MAC address. A priority-100 flow for IPv6 packets
                     with match eth.dst == 00:00:00:00:00:00 has the following
                     actions:

                     nd_ns {
                         nd.target = xxreg0;
                         output;
                     };


                      (Ingress table IP Routing initialized reg1 with
                      the IP address owned by outport and (xx)reg0 with
                      the next-hop IP address.)

                     The  IP  packet  that triggers the ARP/IPv6 NS request is
                     dropped.

              ·      Known MAC address. A priority-0 flow  with  match  1  has
                     actions output;.

     Egress Table 0: UNDNAT

       This is for already established connections’ reverse traffic,
       i.e., DNAT has already been done in the ingress pipeline and
       now the packet has entered the egress pipeline as part of a
       reply. For NAT on a distributed router, it is unDNATted here.
       For Gateway routers, the unDNAT processing is carried out in
       the ingress DNAT table.

               ·      For all the configured load balancing rules for a
                      router with gateway port in the OVN_Northbound
                      database that include an IPv4 address VIP, for
                      every backend IPv4 address B defined for the VIP
                      a priority-120 flow is programmed on the
                      redirect-chassis that matches ip && ip4.src == B
                      && outport == GW, where GW is the logical router
                      gateway port, with an action ct_dnat;. If the
                      backend IPv4 address B is also configured with L4
                      port PORT of protocol P, then the match also
                      includes P.src == PORT. These flows are not added
                      for load balancers with IPv6 VIPs.

                      If the router is configured to force SNAT any
                      load-balanced packets, the above action will be
                      replaced by flags.force_snat_for_lb = 1;
                      ct_dnat;.

              ·      For each configuration in the OVN Northbound database
                     that asks to change the destination IP address of a
                     packet from an IP address of A to B, a priority-100
                     flow matches ip && ip4.src == B && outport == GW,
                     where GW is the logical router gateway port, with an
                     action ct_dnat;. If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.src= (B).

                     If  the  NAT rule cannot be handled in a distributed man‐
                     ner, then the priority-100 flow above is only  programmed
                     on the redirect-chassis.

                     If  the  NAT rule can be handled in a distributed manner,
                     then there is an additional action eth.src =  EA;,  where
                     EA is the ethernet address associated with the IP address
                     A in the NAT rule. This allows upstream MAC  learning  to
                     point to the correct chassis.

              ·      A priority-0 logical flow with match 1 has actions next;.
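
       As a sketch, assuming a hypothetical load balancer with VIP
       172.16.0.10:80 (tcp), backend 10.0.0.2:8080, and gateway port
       lrp-outside (all names and addresses illustrative), the
       priority-120 flow above might look roughly like:

              table=0 (lr_out_undnat     ), priority=120  ,
                match=(ip && ip4.src == 10.0.0.2 && tcp.src == 8080
                       && outport == "lrp-outside"),
                action=(ct_dnat;)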

     Egress Table 1: SNAT

       Packets  that  are  configured to be SNATed get their source IP address
       changed based on the configuration in the OVN Northbound database.

              ·      A priority-120 flow advances IPv6 Neighbor
                     Solicitation packets to the next table to skip SNAT.
                     When ovn-controller injects an IPv6 Neighbor
                     Solicitation packet (for the nd_ns action), the packet
                     should not go through conntrack. A sketch of this flow
                     follows.
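
       A minimal sketch of that flow, assuming it matches on nd_ns in the
       lr_out_snat stage:

              table=1 (lr_out_snat       ), priority=120  ,
                match=(nd_ns),
                action=(next;)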

       Egress Table 1: SNAT on Gateway Routers

              ·      If the Gateway router in the OVN Northbound database
                     has been configured to force SNAT a packet (that has
                     been previously DNATted) to B, a priority-100 flow
                     matches flags.force_snat_for_dnat == 1 && ip with an
                     action ct_snat(B);.

                     If the Gateway router in the OVN Northbound database
                     has been configured to force SNAT a packet (that has
                     been previously load-balanced) to B, a priority-100
                     flow matches flags.force_snat_for_lb == 1 && ip with
                     an action ct_snat(B);.

                     For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from an IP address A, or from any address belonging to
                     network A, to B, a flow matches ip && ip4.src == A
                     with an action ct_snat(B);. The priority of the flow
                     is calculated based on the mask of A, with matches
                     having larger masks getting higher priorities (so a
                     rule for a /24 network takes precedence over an
                     overlapping /16 rule). If the NAT rule is of type
                     dnat_and_snat and has stateless=true in the options,
                     then the action would be ip4/6.src= (B). An example
                     appears after this list.

                     A priority-0 logical flow with match 1 has actions
                     next;.
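
       For example, a hypothetical SNAT rule translating network
       10.0.0.0/24 to 192.168.1.1 on a Gateway router might yield roughly
       the following flow (the priority shown is illustrative, derived
       from the /24 mask):

              table=1 (lr_out_snat       ), priority=25   ,
                match=(ip && ip4.src == 10.0.0.0/24),
                action=(ct_snat(192.168.1.1);)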

       Egress Table 1: SNAT on Distributed Routers

              ·      For each configuration in the OVN Northbound database
                     that asks to change the source IP address of a packet
                     from an IP address A, or from any address belonging to
                     network A, to B, a flow matches ip && ip4.src == A &&
                     outport == GW, where GW is the logical router gateway
                     port, with an action ct_snat(B);. The priority of the
                     flow is calculated based on the mask of A, with
                     matches having larger masks getting higher priorities.
                     If the NAT rule is of type dnat_and_snat and has
                     stateless=true in the options, then the action would
                     be ip4/6.src= (B).

                     If the NAT rule cannot be handled in a distributed
                     manner, then the flow above is only programmed on the
                     redirect-chassis, with its priority increased by 128
                     so that it takes precedence. (See the sketch after
                     this list.)

                     If the NAT rule can be handled in a  distributed  manner,
                     then  there  is an additional action eth.src = EA;, where
                     EA is the ethernet address associated with the IP address
                     A  in  the NAT rule. This allows upstream MAC learning to
                     point to the correct chassis.

              ·      A priority-0 logical flow with match 1 has actions next;.
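
       As a sketch, the same hypothetical 10.0.0.0/24 to 192.168.1.1 SNAT
       rule, when it cannot be handled in a distributed manner, might be
       programmed with its priority raised by 128 and confined to the
       redirect-chassis (shown here as an is_chassis_resident() match, as
       in the Egress Loopback table):

              table=1 (lr_out_snat       ), priority=153  ,
                match=(ip && ip4.src == 10.0.0.0/24
                       && outport == "lrp-outside"
                       && is_chassis_resident("cr-lrp-outside")),
                action=(ct_snat(192.168.1.1);)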

     Egress Table 2: Egress Loopback

       This table applies to distributed logical routers where one of the
       logical router ports specifies a redirect-chassis.

       While  UNDNAT  and SNAT processing have already occurred by this point,
       this traffic needs to be forced through egress loopback  on  this  dis‐
       tributed gateway port instance, in order for UNSNAT and DNAT processing
       to be applied, and also for IP routing and ARP resolution after all  of
       the NAT processing, so that the packet can be forwarded to the destina‐
       tion.

       This table has the following flows:

              ·      For each NAT rule in the OVN Northbound database on a
                     distributed router, a priority-100 logical flow with
                     match ip4.dst == E && outport == GW &&
                     is_chassis_resident(P), where E is the external IP
                     address specified in the NAT rule and GW is the
                     logical router distributed gateway port. For a
                     dnat_and_snat NAT rule, P is the logical port
                     specified in the NAT rule; if the logical_port column
                     of the NAT table is not set, then P is the
                     chassisredirect port of GW. The flow has the following
                     actions:

                     clone {
                         ct_clear;
                         inport = outport;
                         outport = "";
                         flags = 0;
                         flags.loopback = 1;
                         reg0 = 0;
                         reg1 = 0;
                         ...
                         reg9 = 0;
                         REGBIT_EGRESS_LOOPBACK = 1;
                         next(pipeline=ingress, table=0);
                     };


                     flags.loopback is set since in_port is unchanged and
                     the packet may return back to that port after NAT
                     processing. REGBIT_EGRESS_LOOPBACK is set to indicate
                     that egress loopback has occurred, in order to skip
                     the source IP address check against the router
                     address. An illustrative flow follows this list.

              ·      A priority-0 logical flow with match 1 has actions next;.
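
       As an illustration, for a hypothetical dnat_and_snat rule with
       external IP 192.168.1.1 on distributed gateway port lrp-outside
       (chassisredirect port cr-lrp-outside), the flow above might look
       roughly like this, with the register clears elided:

              table=2 (lr_out_egr_loop   ), priority=100  ,
                match=(ip4.dst == 192.168.1.1 && outport == "lrp-outside"
                       && is_chassis_resident("cr-lrp-outside")),
                action=(clone { ct_clear; inport = outport; outport = "";
                        flags = 0; flags.loopback = 1;
                        REGBIT_EGRESS_LOOPBACK = 1;
                        next(pipeline=ingress, table=0); };)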

     Egress Table 3: Delivery

       Packets that reach this table are ready for delivery. It contains:

              ·      Priority-110 logical flows that match IP multicast  pack‐
                     ets  on  each  enabled logical router port and modify the
                     Ethernet source address of the packets  to  the  Ethernet
                     address of the port and then execute action output;.

              ·      Priority-100 logical flows that match packets on each
                     enabled logical router port, with action output;, as
                     in the example below.
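
       For example, for a hypothetical logical router port lrp-outside,
       the priority-100 flow might render roughly as:

              table=3 (lr_out_delivery   ), priority=100  ,
                match=(outport == "lrp-outside"),
                action=(output;)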



Open vSwitch 20.03.90             ovn-northd                     ovn-northd(8)