============
Architecture
============
| 4 | |
| 5 | This document describes the **Distributed Switch Architecture (DSA)** subsystem |
| 6 | design principles, limitations, interactions with other subsystems, and how to |
| 7 | develop drivers for this subsystem as well as a TODO for developers interested |
| 8 | in joining the effort. |
| 9 | |
| 10 | Design principles |
| 11 | ================= |
| 12 | |
| 13 | The Distributed Switch Architecture is a subsystem which was primarily designed |
| 14 | to support Marvell Ethernet switches (MV88E6xxx, a.k.a Linkstreet product line) |
| 15 | using Linux, but has since evolved to support other vendors as well. |
| 16 | |
The original philosophy behind this design was that unmodified Linux tools such
as bridge, iproute2 and ifconfig should work transparently, whether they
configure or query a switch port network device or a regular network device.
| 21 | |
An Ethernet switch typically comprises multiple front-panel ports and one or
more CPU or management ports. The DSA subsystem currently relies on the
presence of a management port connected to an Ethernet controller capable of
receiving Ethernet frames from the switch. This is a very common setup for all
kinds of Ethernet switches found in small home and office products: routers,
gateways, or even top-of-the-rack switches. This host Ethernet controller is
later referred to as "master" and "cpu" in DSA terminology and code.
| 29 | |
| 30 | The D in DSA stands for Distributed, because the subsystem has been designed |
| 31 | with the ability to configure and manage cascaded switches on top of each other |
| 32 | using upstream and downstream Ethernet links between switches. These specific |
| 33 | ports are referred to as "dsa" ports in DSA terminology and code. A collection |
| 34 | of multiple switches connected to each other is called a "switch tree". |
| 35 | |
| 36 | For each front-panel port, DSA will create specialized network devices which are |
| 37 | used as controlling and data-flowing endpoints for use by the Linux networking |
| 38 | stack. These specialized network interfaces are referred to as "slave" network |
| 39 | interfaces in DSA terminology and code. |
| 40 | |
The ideal case for using DSA is when an Ethernet switch supports a "switch tag",
a hardware feature which makes the switch insert a specific tag for each
Ethernet frame travelling to or from specific ports, to help the management
interface figure out:
| 45 | |
| 46 | - what port is this frame coming from |
| 47 | - what was the reason why this frame got forwarded |
| 48 | - how to send CPU originated traffic to specific ports |
| 49 | |
| 50 | The subsystem does support switches not capable of inserting/stripping tags, but |
| 51 | the features might be slightly limited in that case (traffic separation relies |
| 52 | on Port-based VLAN IDs). |
| 53 | |
| 54 | Note that DSA does not currently create network interfaces for the "cpu" and |
| 55 | "dsa" ports because: |
| 56 | |
| 57 | - the "cpu" port is the Ethernet switch facing side of the management |
| 58 | controller, and as such, would create a duplication of feature, since you |
| 59 | would get two interfaces for the same conduit: master netdev, and "cpu" netdev |
| 60 | |
| 61 | - the "dsa" port(s) are just conduits between two or more switches, and as such |
| 62 | cannot really be used as proper network interfaces either, only the |
| 63 | downstream, or the top-most upstream interface makes sense with that model |
| 64 | |
| 65 | Switch tagging protocols |
| 66 | ------------------------ |
| 67 | |
| 68 | DSA currently supports 5 different tagging protocols, and a tag-less mode as |
| 69 | well. The different protocols are implemented in: |
| 70 | |
- ``net/dsa/tag_trailer.c``: Marvell's 4-byte trailer tag mode (legacy)
- ``net/dsa/tag_dsa.c``: Marvell's original DSA tag
- ``net/dsa/tag_edsa.c``: Marvell's enhanced DSA tag
- ``net/dsa/tag_brcm.c``: Broadcom's 4-byte tag
- ``net/dsa/tag_qca.c``: Qualcomm's 2-byte tag
| 76 | |
| 77 | The exact format of the tag protocol is vendor specific, but in general, they |
| 78 | all contain something which: |
| 79 | |
| 80 | - identifies which port the Ethernet frame came from/should be sent to |
| 81 | - provides a reason why this frame was forwarded to the management interface |
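
The two roles listed above can be illustrated with a small userspace sketch.
The 4-byte layout, the magic byte, the reason codes and every ``demo_`` name
are hypothetical and do not match any of the vendor formats listed earlier;
the sketch only shows that a tag carries a port number and a forwarding reason.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical reason codes; real tag formats define their own. */
    enum demo_reason {
        DEMO_REASON_FORWARD = 0,    /* ordinary forwarded traffic */
        DEMO_REASON_MGMT    = 1,    /* management/control frame */
    };

    struct demo_tag {
        uint8_t port;    /* source port on receive, destination on transmit */
        uint8_t reason;  /* why the switch sent the frame to the CPU */
    };

    /* Hypothetical layout: magic | reason | zero | port. */
    #define DEMO_TAG_LEN    4
    #define DEMO_TAG_MAGIC  0xd5

    static void demo_tag_encode(const struct demo_tag *tag,
                                uint8_t out[DEMO_TAG_LEN])
    {
        out[0] = DEMO_TAG_MAGIC;
        out[1] = tag->reason;
        out[2] = 0;
        out[3] = tag->port;
    }

    /* Returns 0 on success, -1 if the bytes do not look like a demo tag. */
    static int demo_tag_decode(const uint8_t in[DEMO_TAG_LEN],
                               struct demo_tag *tag)
    {
        if (in[0] != DEMO_TAG_MAGIC || in[2] != 0)
            return -1;
        tag->reason = in[1];
        tag->port = in[3];
        return 0;
    }

Encoding a tag and decoding the resulting bytes should give back the same port
and reason, which is the property the tagging protocol drivers rely on.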
| 82 | |
| 83 | Master network devices |
| 84 | ---------------------- |
| 85 | |
Master network devices are regular, unmodified Linux network device drivers for
the CPU/management Ethernet interface. Such a driver might occasionally need to
know whether DSA is enabled (e.g.: to enable/disable specific offload features),
but the DSA subsystem has been proven to work with industry standard drivers:
``e1000e``, ``mv643xx_eth`` etc. without having to introduce modifications to
these drivers. Such network devices are also often referred to as conduit
network devices since they act as a pipe between the host processor and the
hardware Ethernet switch.
| 94 | |
| 95 | Networking stack hooks |
| 96 | ---------------------- |
| 97 | |
When a master netdev is used with DSA, a small hook is placed in the networking
stack in order to have the DSA subsystem process the Ethernet switch specific
tagging protocol. DSA accomplishes this by registering a specific (and fake)
Ethernet type (later becoming ``skb->protocol``) with the networking stack; this
is also known as a ``ptype`` or ``packet_type``. A typical Ethernet frame
receive sequence looks like this:
| 104 | |
Master network device (e.g.: e1000e):

1. Receive interrupt fires:

   - receive function is invoked
   - basic packet processing is done: getting length, status etc.
   - packet is prepared to be processed by the Ethernet layer by calling
     ``eth_type_trans``

2. net/ethernet/eth.c::

       eth_type_trans(skb, dev)
               if (dev->dsa_ptr != NULL)
                       -> skb->protocol = ETH_P_XDSA

3. drivers/net/ethernet/\*::

       netif_receive_skb(skb)
               -> iterate over registered packet_type
                       -> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()

4. net/dsa/dsa.c::

       -> dsa_switch_rcv()
               -> invoke switch tag specific protocol handler in 'net/dsa/tag_*.c'

5. net/dsa/tag_*.c:

   - inspect and strip switch tag protocol to determine originating port
   - locate per-port network device
   - invoke ``eth_type_trans()`` with the DSA slave network device
   - invoke ``netif_receive_skb()``
| 137 | |
| 138 | Past this point, the DSA slave network devices get delivered regular Ethernet |
| 139 | frames that can be processed by the networking stack. |
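
The last step of the receive sequence can be modelled in plain userspace C.
This is only a sketch under stated assumptions: the tag is assumed to sit
between the source MAC address and the EtherType, to be 4 bytes long, and to
carry the source port in its last byte. The ``demo_`` names and the device
name table are hypothetical stand-ins for the per-port slave network devices.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define DEMO_ETH_ALEN   6
    #define DEMO_TAG_LEN    4   /* hypothetical tag size */
    #define DEMO_NUM_PORTS  3

    /* Stand-ins for the per-port slave network devices. */
    static const char *const demo_slave_names[DEMO_NUM_PORTS] = {
        "sw0p0", "sw0p1", "sw0p2",
    };

    /*
     * Assumed layout: dst MAC | src MAC | tag | EtherType | payload.
     * Strips the tag in place, stores the new frame start and length in
     * *out and *out_len, and returns the receiving device name. Returns
     * NULL when the frame is too short or names an unknown port.
     */
    static const char *demo_rcv(uint8_t *frame, size_t len,
                                uint8_t **out, size_t *out_len)
    {
        size_t hdr = 2 * DEMO_ETH_ALEN;
        uint8_t port;

        if (len < hdr + DEMO_TAG_LEN + 2)
            return NULL;

        port = frame[hdr + DEMO_TAG_LEN - 1];
        if (port >= DEMO_NUM_PORTS)
            return NULL;

        /* Slide both MAC addresses over the tag, then skip the stale bytes. */
        memmove(frame + DEMO_TAG_LEN, frame, hdr);
        *out = frame + DEMO_TAG_LEN;
        *out_len = len - DEMO_TAG_LEN;
        return demo_slave_names[port];
    }

After the call, the buffer starting at ``*out`` is a regular Ethernet frame
again, which is what the slave network device hands to the networking stack.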
| 140 | |
| 141 | Slave network devices |
| 142 | --------------------- |
| 143 | |
Slave network devices created by DSA are stacked on top of their master network
device. Each of these network interfaces is responsible for being a
controlling and data-flowing end-point for one front-panel port of the switch.
These interfaces are specialized in order to:
| 148 | |
| 149 | - insert/remove the switch tag protocol (if it exists) when sending traffic |
| 150 | to/from specific switch ports |
| 151 | - query the switch for ethtool operations: statistics, link state, |
| 152 | Wake-on-LAN, register dumps... |
| 153 | - external/internal PHY management: link, auto-negotiation etc. |
| 154 | |
| 155 | These slave network devices have custom net_device_ops and ethtool_ops function |
| 156 | pointers which allow DSA to introduce a level of layering between the networking |
| 157 | stack/ethtool, and the switch driver implementation. |
| 158 | |
| 159 | Upon frame transmission from these slave network devices, DSA will look up which |
| 160 | switch tagging protocol is currently registered with these network devices, and |
| 161 | invoke a specific transmit routine which takes care of adding the relevant |
| 162 | switch tag in the Ethernet frames. |
| 163 | |
These frames are then queued for transmission using the master network device
``ndo_start_xmit()`` function. Since they contain the appropriate switch tag, the
Ethernet switch is able to process these incoming frames from the management
interface and deliver them to the physical switch port.
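
The transmit side can be sketched the same way in userspace C. The layout is
an assumption made for illustration (a 4-byte tag inserted after the source
MAC address, with the egress port in its last byte), and ``demo_xmit_tag()``
is a hypothetical helper, not a function of the DSA core.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define DEMO_ETH_ALEN   6
    #define DEMO_TAG_LEN    4   /* hypothetical tag size */

    /*
     * Builds the tagged frame that would be handed to the master device.
     * Assumed layout: dst MAC | src MAC | tag | rest of the original frame.
     * Returns the tagged length, or 0 if the input is shorter than an
     * Ethernet header or the output buffer is too small.
     */
    static size_t demo_xmit_tag(const uint8_t *frame, size_t len, uint8_t port,
                                uint8_t *out, size_t out_cap)
    {
        size_t hdr = 2 * DEMO_ETH_ALEN;

        if (len < hdr + 2 || out_cap < len + DEMO_TAG_LEN)
            return 0;

        memcpy(out, frame, hdr);                /* MAC addresses */
        memset(out + hdr, 0, DEMO_TAG_LEN);     /* start from an empty tag */
        out[hdr + DEMO_TAG_LEN - 1] = port;     /* egress port */
        memcpy(out + hdr + DEMO_TAG_LEN, frame + hdr, len - hdr);
        return len + DEMO_TAG_LEN;
    }

The frame grows by the tag length, which is why a master device used with a
tagging switch has to cope with frames slightly larger than a standard one.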
| 168 | |
| 169 | Graphical representation |
| 170 | ------------------------ |
| 171 | |
Summarized, this is basically what DSA looks like from a network device
perspective::
| 174 | |
| 175 | |
            |---------------------------
            | CPU network device (eth0)|
            ----------------------------
            | <tag added by switch     |
            |                          |
            |                          |
            |        tag added by CPU> |
    |--------------------------------------------|
    | Switch driver                              |
    |--------------------------------------------|
        ||        ||         ||
    |-------|  |-------|  |-------|
    | sw0p0 |  | sw0p1 |  | sw0p2 |
    |-------|  |-------|  |-------|
| 190 | |
| 191 | |
| 192 | |
| 193 | Slave MDIO bus |
| 194 | -------------- |
| 195 | |
In order to be able to read from and write to a PHY built into the switch, DSA
creates a slave MDIO bus which allows a specific switch driver to divert and
intercept MDIO reads/writes towards specific PHY addresses. In most
MDIO-connected switches, these functions would utilize direct or indirect PHY
addressing mode to return standard MII registers from the switch builtin PHYs,
allowing the PHY library to return link status, link partner pages,
auto-negotiation results etc.
| 203 | |
| 204 | For Ethernet switches which have both external and internal MDIO busses, the |
| 205 | slave MII bus can be utilized to mux/demux MDIO reads and writes towards either |
| 206 | internal or external MDIO devices this switch might be connected to: internal |
| 207 | PHYs, external PHYs, or even external switches. |
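
The diversion can be pictured with the following userspace model. It is a
sketch, not the DSA implementation: ``struct demo_switch``, its ``phys_mask``
field and both accessors are hypothetical, and the register file is a plain
array standing in for the switch hardware.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define DEMO_NUM_PHYS   32  /* MDIO addresses range from 0 to 31 */
    #define DEMO_NUM_REGS   32

    /* Hypothetical switch state: claimed addresses plus a register file. */
    struct demo_switch {
        uint32_t phys_mask;     /* bit N set: address N is a built-in PHY */
        uint16_t regs[DEMO_NUM_PHYS][DEMO_NUM_REGS];
    };

    static bool demo_claims(const struct demo_switch *sw, int addr, int reg)
    {
        if (addr < 0 || addr >= DEMO_NUM_PHYS)
            return false;
        if (reg < 0 || reg >= DEMO_NUM_REGS)
            return false;
        return (sw->phys_mask & (UINT32_C(1) << addr)) != 0;
    }

    /* Read through the slave bus; unclaimed addresses read as 0xffff. */
    static int demo_slave_mii_read(const struct demo_switch *sw,
                                   int addr, int reg)
    {
        if (!demo_claims(sw, addr, reg))
            return 0xffff;
        return sw->regs[addr][reg];
    }

    /* Write through the slave bus; returns 0, or -1 if unclaimed. */
    static int demo_slave_mii_write(struct demo_switch *sw,
                                    int addr, int reg, uint16_t val)
    {
        if (!demo_claims(sw, addr, reg))
            return -1;
        sw->regs[addr][reg] = val;
        return 0;
    }

Only addresses whose bit is set in the mask reach the switch; everything else
behaves like an empty bus slot.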
| 208 | |
| 209 | Data structures |
| 210 | --------------- |
| 211 | |
| 212 | DSA data structures are defined in ``include/net/dsa.h`` as well as |
| 213 | ``net/dsa/dsa_priv.h``: |
| 214 | |
| 215 | - ``dsa_chip_data``: platform data configuration for a given switch device, |
| 216 | this structure describes a switch device's parent device, its address, as |
| 217 | well as various properties of its ports: names/labels, and finally a routing |
| 218 | table indication (when cascading switches) |
| 219 | |
- ``dsa_platform_data``: platform device configuration data which can reference
  a collection of ``dsa_chip_data`` structures if multiple switches are cascaded;
  the master network device this switch tree is attached to needs to be
  referenced
| 224 | |
- ``dsa_switch_tree``: structure assigned to the master network device under
  ``dsa_ptr``. This structure references a ``dsa_platform_data`` structure as
  well as the tagging protocol supported by the switch tree, and which
  receive/transmit function hooks should be invoked. Information about the
  directly attached switch is also provided: CPU port. Finally, a collection of
  ``dsa_switch`` structures is referenced to address individual switches in the
  tree.
| 231 | |
- ``dsa_switch``: structure describing a switch device in the tree, referencing
  a ``dsa_switch_tree`` as a backpointer, slave network devices, master network
  device, and a reference to the backing ``dsa_switch_ops``
| 235 | |
| 236 | - ``dsa_switch_ops``: structure referencing function pointers, see below for a |
| 237 | full description. |
| 238 | |
| 239 | Design limitations |
| 240 | ================== |
| 241 | |
| 242 | Limits on the number of devices and ports |
| 243 | ----------------------------------------- |
| 244 | |
DSA currently limits the maximum number of switches within a tree to 4
(``DSA_MAX_SWITCHES``), and the number of ports per switch to 12
(``DSA_MAX_PORTS``). These limits could be extended to support larger
configurations should the need arise.
| 249 | |
| 250 | Lack of CPU/DSA network devices |
| 251 | ------------------------------- |
| 252 | |
| 253 | DSA does not currently create slave network devices for the CPU or DSA ports, as |
| 254 | described before. This might be an issue in the following cases: |
| 255 | |
- inability to fetch switch CPU port statistics counters using ethtool, which
  can make it harder to debug MDIO switches connected using xMII interfaces
| 258 | |
| 259 | - inability to configure the CPU port link parameters based on the Ethernet |
| 260 | controller capabilities attached to it: http://patchwork.ozlabs.org/patch/509806/ |
| 261 | |
| 262 | - inability to configure specific VLAN IDs / trunking VLANs between switches |
| 263 | when using a cascaded setup |
| 264 | |
| 265 | Common pitfalls using DSA setups |
| 266 | -------------------------------- |
| 267 | |
Once a master network device is configured to use DSA (``dev->dsa_ptr`` becomes
non-NULL), and the switch behind it expects a tagging protocol, this network
interface can only be used as a conduit interface. Sending packets directly
through this interface (e.g.: opening a socket using this interface) bypasses
the switch tagging protocol transmit function, so the Ethernet switch on the
other end, expecting a tag, will typically drop the frame.
| 275 | |
| 276 | Slave network devices check that the master network device is UP before allowing |
| 277 | you to administratively bring UP these slave network devices. A common |
| 278 | configuration mistake is forgetting to bring UP the master network device first. |
| 279 | |
| 280 | Interactions with other subsystems |
| 281 | ================================== |
| 282 | |
| 283 | DSA currently leverages the following subsystems: |
| 284 | |
| 285 | - MDIO/PHY library: ``drivers/net/phy/phy.c``, ``mdio_bus.c`` |
- Switchdev: ``net/switchdev/*``
| 287 | - Device Tree for various of_* functions |
| 288 | |
| 289 | MDIO/PHY library |
| 290 | ---------------- |
| 291 | |
Slave network devices exposed by DSA may or may not be interfacing with PHY
devices (``struct phy_device`` as defined in ``include/linux/phy.h``), but the
DSA subsystem deals with all possible combinations:
| 295 | |
| 296 | - internal PHY devices, built into the Ethernet switch hardware |
| 297 | - external PHY devices, connected via an internal or external MDIO bus |
| 298 | - internal PHY devices, connected via an internal MDIO bus |
| 299 | - special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a |
| 300 | fixed PHYs |
| 301 | |
| 302 | The PHY configuration is done by the ``dsa_slave_phy_setup()`` function and the |
| 303 | logic basically looks like this: |
| 304 | |
| 305 | - if Device Tree is used, the PHY device is looked up using the standard |
| 306 | "phy-handle" property, if found, this PHY device is created and registered |
| 307 | using ``of_phy_connect()`` |
| 308 | |
| 309 | - if Device Tree is used, and the PHY device is "fixed", that is, conforms to |
| 310 | the definition of a non-MDIO managed PHY as defined in |
| 311 | ``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered |
| 312 | and connected transparently using the special fixed MDIO bus driver |
| 313 | |
| 314 | - finally, if the PHY is built into the switch, as is very common with |
| 315 | standalone switch packages, the PHY is probed using the slave MII bus created |
| 316 | by DSA |
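
The decision order described above can be written down as a small userspace
function. This is a sketch of the logic as listed in this section only; the
``demo_`` types and field names are hypothetical and the real
``dsa_slave_phy_setup()`` works on Device Tree nodes and PHY library objects.

.. code-block:: c

    #include <stdbool.h>

    /* Where the PHY for a port comes from, in the order listed above. */
    enum demo_phy_source {
        DEMO_PHY_FROM_HANDLE,   /* found through the "phy-handle" property */
        DEMO_PHY_FIXED,         /* fixed link, no MDIO management */
        DEMO_PHY_ON_SLAVE_MII,  /* built-in PHY probed on the slave MII bus */
    };

    /* Hypothetical summary of what the port description contains. */
    struct demo_port_node {
        bool has_dt_node;
        bool has_phy_handle;
        bool is_fixed_link;
    };

    static enum demo_phy_source
    demo_pick_phy_source(const struct demo_port_node *n)
    {
        if (n->has_dt_node && n->has_phy_handle)
            return DEMO_PHY_FROM_HANDLE;
        if (n->has_dt_node && n->is_fixed_link)
            return DEMO_PHY_FIXED;
        /* Fall back to the PHY built into the switch. */
        return DEMO_PHY_ON_SLAVE_MII;
    }

A port without any Device Tree description always ends up on the slave MII
bus, which matches the common case of standalone switch packages.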
| 317 | |
| 318 | |
| 319 | SWITCHDEV |
| 320 | --------- |
| 321 | |
| 322 | DSA directly utilizes SWITCHDEV when interfacing with the bridge layer, and |
| 323 | more specifically with its VLAN filtering portion when configuring VLANs on top |
| 324 | of per-port slave network devices. Since DSA primarily deals with |
| 325 | MDIO-connected switches, although not exclusively, SWITCHDEV's |
| 326 | prepare/abort/commit phases are often simplified into a prepare phase which |
| 327 | checks whether the operation is supported by the DSA switch driver, and a commit |
| 328 | phase which applies the changes. |
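
The simplified two-phase model can be sketched in userspace C as follows. The
``demo_`` structure and functions are hypothetical and only mirror the idea:
the prepare step validates and never changes state, and the commit step
applies a change that was already accepted.

.. code-block:: c

    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define DEMO_MAX_VID    4094

    /* Hypothetical per-port VLAN state kept by a switch driver. */
    struct demo_port {
        bool hw_vlan_capable;
        uint8_t member[(DEMO_MAX_VID / 8) + 1];    /* one bit per VLAN ID */
    };

    /* Prepare: validate only, never touch the state. */
    static int demo_vlan_prepare(const struct demo_port *p, uint16_t vid)
    {
        if (!p->hw_vlan_capable)
            return -EOPNOTSUPP;
        if (vid == 0 || vid > DEMO_MAX_VID)
            return -EINVAL;
        return 0;
    }

    /* Commit: apply a change that prepare already accepted. */
    static void demo_vlan_commit(struct demo_port *p, uint16_t vid)
    {
        p->member[vid / 8] |= (uint8_t)(1u << (vid % 8));
    }

    /* The caller drives both phases in order. */
    static int demo_vlan_add(struct demo_port *p, uint16_t vid)
    {
        int err = demo_vlan_prepare(p, vid);

        if (err)
            return err;
        demo_vlan_commit(p, vid);
        return 0;
    }

    static bool demo_vlan_is_member(const struct demo_port *p, uint16_t vid)
    {
        if (vid == 0 || vid > DEMO_MAX_VID)
            return false;
        return (p->member[vid / 8] & (1u << (vid % 8))) != 0;
    }

Because the commit step cannot fail in this model, an error from the prepare
step leaves the port state untouched.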
| 329 | |
| 330 | As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN |
| 331 | objects. |
| 332 | |
| 333 | Device Tree |
| 334 | ----------- |
| 335 | |
| 336 | DSA features a standardized binding which is documented in |
| 337 | ``Documentation/devicetree/bindings/net/dsa/dsa.txt``. PHY/MDIO library helper |
| 338 | functions such as ``of_get_phy_mode()``, ``of_phy_connect()`` are also used to query |
| 339 | per-port PHY specific details: interface connection, MDIO bus location etc.. |
| 340 | |
| 341 | Driver development |
| 342 | ================== |
| 343 | |
| 344 | DSA switch drivers need to implement a dsa_switch_ops structure which will |
| 345 | contain the various members described below. |
| 346 | |
| 347 | ``register_switch_driver()`` registers this dsa_switch_ops in its internal list |
| 348 | of drivers to probe for. ``unregister_switch_driver()`` does the exact opposite. |
| 349 | |
| 350 | Unless requested differently by setting the priv_size member accordingly, DSA |
| 351 | does not allocate any driver private context space. |
| 352 | |
| 353 | Switch configuration |
| 354 | -------------------- |
| 355 | |
| 356 | - ``tag_protocol``: this is to indicate what kind of tagging protocol is supported, |
| 357 | should be a valid value from the ``dsa_tag_protocol`` enum |
| 358 | |
| 359 | - ``probe``: probe routine which will be invoked by the DSA platform device upon |
| 360 | registration to test for the presence/absence of a switch device. For MDIO |
| 361 | devices, it is recommended to issue a read towards internal registers using |
| 362 | the switch pseudo-PHY and return whether this is a supported device. For other |
| 363 | buses, return a non-NULL string |
| 364 | |
| 365 | - ``setup``: setup function for the switch, this function is responsible for setting |
| 366 | up the ``dsa_switch_ops`` private structure with all it needs: register maps, |
| 367 | interrupts, mutexes, locks etc.. This function is also expected to properly |
| 368 | configure the switch to separate all network interfaces from each other, that |
| 369 | is, they should be isolated by the switch hardware itself, typically by creating |
| 370 | a Port-based VLAN ID for each port and allowing only the CPU port and the |
| 371 | specific port to be in the forwarding vector. Ports that are unused by the |
| 372 | platform should be disabled. Past this function, the switch is expected to be |
| 373 | fully configured and ready to serve any kind of request. It is recommended |
| 374 | to issue a software reset of the switch during this setup function in order to |
| 375 | avoid relying on what a previous software agent such as a bootloader/firmware |
| 376 | may have previously configured. |
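
The port isolation expected from ``setup`` can be illustrated by computing one
forwarding vector per port. This is a userspace sketch; the port count, the
``demo_isolate_ports()`` helper and the bitmask representation are
assumptions, and a real driver would write these vectors to switch registers.

.. code-block:: c

    #include <stdint.h>

    #define DEMO_NUM_PORTS  6

    /*
     * Fills vec[port] with the set of ports allowed in that port's
     * forwarding vector. An enabled user port gets itself plus the CPU
     * port, the CPU port gets itself plus every enabled user port, and
     * unused ports get an empty vector (disabled).
     */
    static void demo_isolate_ports(unsigned int cpu_port,
                                   uint32_t enabled_mask,
                                   uint32_t vec[DEMO_NUM_PORTS])
    {
        uint32_t cpu_bit = UINT32_C(1) << cpu_port;
        unsigned int port;

        for (port = 0; port < DEMO_NUM_PORTS; port++) {
            uint32_t bit = UINT32_C(1) << port;

            if (port == cpu_port)
                vec[port] = enabled_mask | cpu_bit;
            else if (enabled_mask & bit)
                vec[port] = bit | cpu_bit;
            else
                vec[port] = 0;
        }
    }

With this layout two user ports never appear in each other's vector, so
traffic between them has to go through the CPU port until a bridge is set up.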
| 377 | |
| 378 | PHY devices and link management |
| 379 | ------------------------------- |
| 380 | |
- ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet
  PHYs. If the PHY library PHY driver needs to know about information it cannot
  obtain on its own (e.g.: coming from switch memory mapped registers), this
  function should return a 32-bit bitmask of "flags" that is private between the
  switch driver and the Ethernet PHY driver in ``drivers/net/phy/\*``.
| 386 | |
| 387 | - ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read |
| 388 | the switch port MDIO registers. If unavailable, return 0xffff for each read. |
| 389 | For builtin switch Ethernet PHYs, this function should allow reading the link |
| 390 | status, auto-negotiation results, link partner pages etc.. |
| 391 | |
| 392 | - ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write |
| 393 | to the switch port MDIO registers. If unavailable return a negative error |
| 394 | code. |
| 395 | |
| 396 | - ``adjust_link``: Function invoked by the PHY library when a slave network device |
| 397 | is attached to a PHY device. This function is responsible for appropriately |
| 398 | configuring the switch port link parameters: speed, duplex, pause based on |
| 399 | what the ``phy_device`` is providing. |
| 400 | |
| 401 | - ``fixed_link_update``: Function invoked by the PHY library, and specifically by |
| 402 | the fixed PHY driver asking the switch driver for link parameters that could |
| 403 | not be auto-negotiated, or obtained by reading the PHY registers through MDIO. |
| 404 | This is particularly useful for specific kinds of hardware such as QSGMII, |
| 405 | MoCA or other kinds of non-MDIO managed PHYs where out of band link |
| 406 | information is obtained |
| 407 | |
| 408 | Ethtool operations |
| 409 | ------------------ |
| 410 | |
| 411 | - ``get_strings``: ethtool function used to query the driver's strings, will |
| 412 | typically return statistics strings, private flags strings etc. |
| 413 | |
| 414 | - ``get_ethtool_stats``: ethtool function used to query per-port statistics and |
| 415 | return their values. DSA overlays slave network devices general statistics: |
| 416 | RX/TX counters from the network device, with switch driver specific statistics |
| 417 | per port |
| 418 | |
| 419 | - ``get_sset_count``: ethtool function used to query the number of statistics items |
| 420 | |
| 421 | - ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port, this |
| 422 | function may, for certain implementations also query the master network device |
| 423 | Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN |
| 424 | |
- ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port,
  direct counterpart to ``get_wol`` with similar restrictions
| 427 | |
| 428 | - ``set_eee``: ethtool function which is used to configure a switch port EEE (Green |
| 429 | Ethernet) settings, can optionally invoke the PHY library to enable EEE at the |
| 430 | PHY level if relevant. This function should enable EEE at the switch port MAC |
| 431 | controller and data-processing logic |
| 432 | |
| 433 | - ``get_eee``: ethtool function which is used to query a switch port EEE settings, |
| 434 | this function should return the EEE state of the switch port MAC controller |
| 435 | and data-processing logic as well as query the PHY for its currently configured |
| 436 | EEE settings |
| 437 | |
| 438 | - ``get_eeprom_len``: ethtool function returning for a given switch the EEPROM |
| 439 | length/size in bytes |
| 440 | |
| 441 | - ``get_eeprom``: ethtool function returning for a given switch the EEPROM contents |
| 442 | |
| 443 | - ``set_eeprom``: ethtool function writing specified data to a given switch EEPROM |
| 444 | |
| 445 | - ``get_regs_len``: ethtool function returning the register length for a given |
| 446 | switch |
| 447 | |
| 448 | - ``get_regs``: ethtool function returning the Ethernet switch internal register |
| 449 | contents. This function might require user-land code in ethtool to |
| 450 | pretty-print register values and registers |
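
The way ``get_sset_count``, ``get_strings`` and ``get_ethtool_stats`` fit
together can be modelled in userspace C. This is a sketch under assumptions:
the generic counters are placed first, the two driver counter names are
invented, and the ``demo_`` functions are hypothetical rather than the DSA
callbacks themselves.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    #define DEMO_GSTRING_LEN    32  /* same size as ETH_GSTRING_LEN */
    #define DEMO_GENERIC_STATS  4
    #define DEMO_DRIVER_STATS   2

    /* General per-netdev counters overlaid by DSA. */
    static const char demo_generic_names[DEMO_GENERIC_STATS][DEMO_GSTRING_LEN] = {
        "tx_packets", "tx_bytes", "rx_packets", "rx_bytes",
    };

    /* Invented switch driver counters. */
    static const char demo_driver_names[DEMO_DRIVER_STATS][DEMO_GSTRING_LEN] = {
        "in_discards", "out_collisions",
    };

    static int demo_get_sset_count(void)
    {
        return DEMO_GENERIC_STATS + DEMO_DRIVER_STATS;
    }

    /* data must hold demo_get_sset_count() * DEMO_GSTRING_LEN bytes. */
    static void demo_get_strings(char *data)
    {
        memcpy(data, demo_generic_names, sizeof(demo_generic_names));
        memcpy(data + sizeof(demo_generic_names), demo_driver_names,
               sizeof(demo_driver_names));
    }

    /* data must hold demo_get_sset_count() values, in the string order. */
    static void demo_get_ethtool_stats(const uint64_t *generic,
                                       const uint64_t *hw, uint64_t *data)
    {
        memcpy(data, generic, DEMO_GENERIC_STATS * sizeof(*data));
        memcpy(data + DEMO_GENERIC_STATS, hw,
               DEMO_DRIVER_STATS * sizeof(*data));
    }

The three functions must agree on count and ordering, otherwise ethtool would
pair values with the wrong names.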
| 451 | |
| 452 | Power management |
| 453 | ---------------- |
| 454 | |
| 455 | - ``suspend``: function invoked by the DSA platform device when the system goes to |
| 456 | suspend, should quiesce all Ethernet switch activities, but keep ports |
| 457 | participating in Wake-on-LAN active as well as additional wake-up logic if |
| 458 | supported |
| 459 | |
| 460 | - ``resume``: function invoked by the DSA platform device when the system resumes, |
| 461 | should resume all Ethernet switch activities and re-configure the switch to be |
| 462 | in a fully active state |
| 463 | |
| 464 | - ``port_enable``: function invoked by the DSA slave network device ndo_open |
| 465 | function when a port is administratively brought up, this function should be |
| 466 | fully enabling a given switch port. DSA takes care of marking the port with |
| 467 | ``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING`` if it |
| 468 | was not, and propagating these changes down to the hardware |
| 469 | |
| 470 | - ``port_disable``: function invoked by the DSA slave network device ndo_close |
| 471 | function when a port is administratively brought down, this function should be |
| 472 | fully disabling a given switch port. DSA takes care of marking the port with |
| 473 | ``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is |
| 474 | disabled while being a bridge member |
| 475 | |
| 476 | Bridge layer |
| 477 | ------------ |
| 478 | |
- ``port_bridge_join``: bridge layer function invoked when a given switch port is
  added to a bridge. This function should do what is necessary at the switch
  level to permit the joining port to be added to the relevant logical domain,
  so that it can ingress/egress traffic with other members of the bridge.
| 483 | |
- ``port_bridge_leave``: bridge layer function invoked when a given switch port is
  removed from a bridge. This function should do what is necessary at the
  switch level to prevent the leaving port from exchanging ingress/egress
  traffic with the remaining bridge members. When the port leaves the bridge,
  it should be aged out at the switch hardware for the switch to (re) learn MAC
  addresses behind this port.
| 490 | |
- ``port_stp_state_set``: bridge layer function invoked when a given switch port
  STP state is computed by the bridge layer and should be propagated to switch
  hardware to forward/block/learn traffic. The switch driver is responsible for
  computing an STP state change based on current and asked parameters and
  performing the relevant ageing based on the intersection results
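
A driver implementing ``port_stp_state_set`` typically needs two pieces of
logic: a mapping from bridge states to hardware port modes, and a rule that
decides when learned addresses should be aged out. The userspace sketch below
shows one plausible version of both. The ``BR_STATE_*`` values are the ones
from ``include/uapi/linux/if_bridge.h``; the hardware modes and the ``demo_``
helpers are hypothetical.

.. code-block:: c

    #include <stdbool.h>

    /* Bridge STP states, as in include/uapi/linux/if_bridge.h. */
    #define BR_STATE_DISABLED   0
    #define BR_STATE_LISTENING  1
    #define BR_STATE_LEARNING   2
    #define BR_STATE_FORWARDING 3
    #define BR_STATE_BLOCKING   4

    /* Hypothetical hardware port modes. */
    enum demo_hw_state {
        DEMO_HW_DISABLED,
        DEMO_HW_BLOCKING,
        DEMO_HW_LEARNING,
        DEMO_HW_FORWARDING,
    };

    static enum demo_hw_state demo_stp_to_hw(int br_state)
    {
        switch (br_state) {
        case BR_STATE_LISTENING:
        case BR_STATE_BLOCKING:
            return DEMO_HW_BLOCKING;
        case BR_STATE_LEARNING:
            return DEMO_HW_LEARNING;
        case BR_STATE_FORWARDING:
            return DEMO_HW_FORWARDING;
        case BR_STATE_DISABLED:
        default:
            return DEMO_HW_DISABLED;
        }
    }

    /* States in which the port learns source MAC addresses. */
    static bool demo_stp_learns(int br_state)
    {
        return br_state == BR_STATE_LEARNING ||
               br_state == BR_STATE_FORWARDING;
    }

    /* Age out entries when a port stops learning. */
    static bool demo_stp_needs_ageing(int old_state, int new_state)
    {
        return demo_stp_learns(old_state) && !demo_stp_learns(new_state);
    }

Comparing the old and new state, rather than looking at the new state alone,
avoids flushing the address table on transitions that keep learning enabled.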
| 496 | |
| 497 | Bridge VLAN filtering |
| 498 | --------------------- |
| 499 | |
- ``port_vlan_filtering``: bridge layer function invoked when the bridge gets
  configured for turning on or off VLAN filtering. If nothing specific needs to
  be done at the hardware level, this callback does not need to be implemented.
  When VLAN filtering is turned on, the hardware must be programmed to reject
  802.1Q frames which have VLAN IDs outside of the programmed allowed VLAN ID
  map/rules. If there is no PVID programmed into the switch port, untagged
  frames must be rejected as well. When turned off, the switch must accept any
  802.1Q frames irrespective of their VLAN ID, and untagged frames are allowed.
| 509 | |
- ``port_vlan_prepare``: bridge layer function invoked when the bridge prepares the
  configuration of a VLAN on the given port. If the operation is not supported
  by the hardware, this function should return ``-EOPNOTSUPP`` to inform the
  bridge code to fall back to a software implementation. No hardware setup must
  be done in this function. See ``port_vlan_add`` for this and details.
| 515 | |
| 516 | - ``port_vlan_add``: bridge layer function invoked when a VLAN is configured |
| 517 | (tagged or untagged) for the given switch port |
| 518 | |
| 519 | - ``port_vlan_del``: bridge layer function invoked when a VLAN is removed from the |
| 520 | given switch port |
| 521 | |
| 522 | - ``port_vlan_dump``: bridge layer function invoked with a switchdev callback |
| 523 | function that the driver has to call for each VLAN the given port is a member |
| 524 | of. A switchdev object is used to carry the VID and bridge flags. |
| 525 | |
- ``port_fdb_add``: bridge layer function invoked when the bridge wants to install
  a Forwarding Database entry. The switch hardware should be programmed with the
  specified address in the specified VLAN ID in the forwarding database
  associated with this VLAN ID. If the operation is not supported, this
  function should return ``-EOPNOTSUPP`` to inform the bridge code to fall back
  to a software implementation.
| 532 | |
| 533 | .. note:: VLAN ID 0 corresponds to the port private database, which, in the context |
| 534 | of DSA, would be its port-based VLAN, used by the associated bridge device. |
| 535 | |
| 536 | - ``port_fdb_del``: bridge layer function invoked when the bridge wants to remove a |
| 537 | Forwarding Database entry, the switch hardware should be programmed to delete |
| 538 | the specified MAC address from the specified VLAN ID if it was mapped into |
| 539 | this port forwarding database |
| 540 | |
| 541 | - ``port_fdb_dump``: bridge layer function invoked with a switchdev callback |
| 542 | function that the driver has to call for each MAC address known to be behind |
| 543 | the given port. A switchdev object is used to carry the VID and FDB info. |
| 544 | |
- ``port_mdb_prepare``: bridge layer function invoked when the bridge prepares the
  installation of a multicast database entry. If the operation is not supported,
  this function should return ``-EOPNOTSUPP`` to inform the bridge code to fall
  back to a software implementation. No hardware setup must be done in this
  function. See ``port_mdb_add`` for this and details.
| 550 | |
| 551 | - ``port_mdb_add``: bridge layer function invoked when the bridge wants to install |
| 552 | a multicast database entry, the switch hardware should be programmed with the |
| 553 | specified address in the specified VLAN ID in the forwarding database |
| 554 | associated with this VLAN ID. |
| 555 | |
| 556 | .. note:: VLAN ID 0 corresponds to the port private database, which, in the context |
| 557 | of DSA, would be its port-based VLAN, used by the associated bridge device. |
| 558 | |
| 559 | - ``port_mdb_del``: bridge layer function invoked when the bridge wants to remove a |
| 560 | multicast database entry, the switch hardware should be programmed to delete |
| 561 | the specified MAC address from the specified VLAN ID if it was mapped into |
| 562 | this port forwarding database. |
| 563 | |
| 564 | - ``port_mdb_dump``: bridge layer function invoked with a switchdev callback |
| 565 | function that the driver has to call for each MAC address known to be behind |
| 566 | the given port. A switchdev object is used to carry the VID and MDB info. |
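
The ingress rules stated for ``port_vlan_filtering`` can be condensed into a
single decision function. The userspace sketch below is an illustration only;
``struct demo_port_vlan``, the ``DEMO_UNTAGGED`` marker and both helpers are
hypothetical and do not correspond to kernel data structures.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define DEMO_UNTAGGED   0       /* marker: frame has no 802.1Q tag */
    #define DEMO_MAX_VID    4094

    /* Hypothetical per-port VLAN configuration. */
    struct demo_port_vlan {
        bool filtering;             /* VLAN filtering turned on */
        bool has_pvid;              /* a PVID is programmed on the port */
        uint8_t allowed[(DEMO_MAX_VID / 8) + 1];   /* one bit per VLAN ID */
    };

    static void demo_vid_allow(struct demo_port_vlan *p, uint16_t vid)
    {
        if (vid >= 1 && vid <= DEMO_MAX_VID)
            p->allowed[vid / 8] |= (uint8_t)(1u << (vid % 8));
    }

    /* vid is DEMO_UNTAGGED for untagged frames, else the 802.1Q VLAN ID. */
    static bool demo_ingress_accept(const struct demo_port_vlan *p,
                                    uint16_t vid)
    {
        if (!p->filtering)
            return true;            /* filtering off: accept everything */
        if (vid == DEMO_UNTAGGED)
            return p->has_pvid;     /* untagged frames need a PVID */
        if (vid > DEMO_MAX_VID)
            return false;
        return (p->allowed[vid / 8] & (1u << (vid % 8))) != 0;
    }

The same port configuration therefore gives different results depending only
on whether filtering is enabled, which is what the bridge layer toggles.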
| 567 | |
| 568 | TODO |
| 569 | ==== |
| 570 | |
Making SWITCHDEV and DSA converge towards a unified codebase
------------------------------------------------------------
| 573 | |
SWITCHDEV properly takes care of abstracting the networking stack with offload
capable hardware, but does not enforce a strict switch device driver model. On
the other hand, DSA enforces a fairly strict device driver model, and deals with
most of the switch-specific details. At some point we should envision a merger
between these two subsystems and get the best of both worlds.
| 579 | |
Other low-hanging fruit
-----------------------
| 582 | |
| 583 | - making the number of ports fully dynamic and not dependent on ``DSA_MAX_PORTS`` |
| 584 | - allowing more than one CPU/management interface: |
| 585 | http://comments.gmane.org/gmane.linux.network/365657 |
| 586 | - porting more drivers from other vendors: |
| 587 | http://comments.gmane.org/gmane.linux.network/365510 |