net: sched: allow qdiscs to share filter block instances
From: Jiri Pirko <jiri-AT-resnulli.us>
To: netdev-AT-vger.kernel.org
Subject: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances
Date: Thu, 12 Oct 2017 19:17:49 +0200
Message-ID: <20171012171823.1431-1-jiri@resnulli.us>
Cc: davem-AT-davemloft.net, jhs-AT-mojatatu.com, xiyou.wangcong-AT-gmail.com, mlxsw-AT-mellanox.com, andrew-AT-lunn.ch, vivien.didelot-AT-savoirfairelinux.com, f.fainelli-AT-gmail.com, michael.chan-AT-broadcom.com, ganeshgr-AT-chelsio.com, jeffrey.t.kirsher-AT-intel.com, saeedm-AT-mellanox.com, matanb-AT-mellanox.com, leonro-AT-mellanox.com, idosch-AT-mellanox.com, jakub.kicinski-AT-netronome.com, ast-AT-kernel.org, daniel-AT-iogearbox.net, simon.horman-AT-netronome.com, pieter.jansenvanvuuren-AT-netronome.com, john.hurley-AT-netronome.com, edumazet-AT-google.com, dsahern-AT-gmail.com, alexander.h.duyck-AT-intel.com, john.fastabend-AT-gmail.com, willemb-AT-google.com

From: Jiri Pirko <jiri@mellanox.com>

First of all, I would like to apologize for the big patchset. However,
after a couple of hours trying to figure out how to cut it, I found out
it is actually not possible. I would have to add some interface in one
patchset and only use it in a second one, which is forbidden. Also, I
would like to provide the reviewer with the full picture. Most of the
patches are small and contained anyway, so they should be easy to
review.

But to the motivation:

Currently the filters added to qdiscs are independent. So for example,
if you have 2 netdevices, you create an ingress qdisc on both and you
want to add identical filter rules to both, you need to add them twice.
This patchset makes this easier and mainly saves resources by allowing
all filters within a qdisc to be shared - I call such a set of filters
a "filter block". This also helps to save resources when we offload to
hw, for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identifier.
If we don't pass any block number, a new one will be generated by the
kernel:

$ tc qdisc add dev ens7 ingress block 22
                                ^^^^^^^^
$ tc qdisc add dev ens8 ingress block 22
                                ^^^^^^^^

Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress ffff: dev ens7 parent ffff:fff1 block 22
qdisc ingress ffff: dev ens8 parent ffff:fff1 block 22

Now we can add a filter to any of the qdiscs sharing the same block:

$ tc filter add dev ens7 parent ffff: protocol ip pref 25 flower dst_ip 192.168.0.0/16 action drop

We will see the same output if we list filters for ens7 and ens8,
including stats:

$ tc -s filter show dev ens7 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 39 sec used 2 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

$ tc -s filter show dev ens8 ingress
filter protocol ip pref 25 flower chain 0
filter protocol ip pref 25 flower chain 0 handle 0x1
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1 installed 40 sec used 3 sec
        Action statistics:
        Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

Patches overview:

Patches 1-3 introduce infrastructure for block sharing and the
  interface functions to the qdisc, tcf_block_get_ext and
  tcf_block_put_ext
Patches 4-11 remove usages of the tp->q pointer, which eventually needs
  to be removed in order to make tcf_proto independent of a qdisc
  instance
Patches 12-19 introduce block callbacks, internal infra and the
  driver-facing interface, and add callback calling to individual
  classifiers
Patches 20-28 convert individual drivers from ndo_setup_tc to block
  callbacks for classifier offloading
Patches 29-31 remove things left unused by the previous conversion
Patch 32 introduces a block mechanism to handle netif_keep_dst calls
Patch 33 removes tp->q and tp->classid, making tcf_proto independent of
  a qdisc
Patch 34 finally enables block sharing for the ingress and clsact
  qdiscs

Iproute2 implementation is here:
https://github.com/jpirko/iproute2_mlxsw/commit/f91ff81e3...

The next patchset will introduce block sharing for mlxsw. For the
curious, the patches can be found here:
https://github.com/jpirko/linux_mlxsw/commits/jiri_devel_...

Jiri Pirko (34):
  net: sched: store Qdisc pointer in struct block
  net: sched: introduce support for multiple filter chain pointers registration
  net: sched: introduce shared filter blocks infrastructure
  net: sched: teach tcf_bind/unbind_filter to use block->q
  net: sched: ematch: obtain net pointer from blocks
  net: core: use dev->ingress_queue instead of tp->q
  net: sched: cls_u32: use block instead of q in tc_u_common
  net: sched: avoid usage of tp->q in tcf_classify
  net: sched: tcindex, fw, flow: use tcf_block_q helper to get struct Qdisc
  net: sched: use tcf_block_q helper to get q pointer for sch_tree_lock
  net: sched: propagate q and parent from caller down to tcf_fill_node
  net: sched: add block bind/unbind notification to drivers
  net: sched: introduce per-block callbacks
  net: sched: use extended variants of block get and put in ingress and clsact qdiscs
  net: sched: use tc_setup_cb_call to call per-block callbacks
  net: sched: cls_matchall: call block callbacks for offload
  net: sched: cls_u32: swap u32_remove_hw_knode and u32_remove_hw_hnode
  net: sched: cls_u32: call block callbacks for offload
  net: sched: cls_bpf: call block callbacks for offload
  mlxsw: spectrum: Convert ndo_setup_tc offloads to block callbacks
  mlx5e: Convert ndo_setup_tc offloads to block callbacks
  bnxt: Convert ndo_setup_tc offloads to block callbacks
  cxgb4: Convert ndo_setup_tc offloads to block callbacks
  ixgbe: Convert ndo_setup_tc offloads to block callbacks
  mlx5e_rep: Convert ndo_setup_tc offloads to block callbacks
  nfp: flower: Convert ndo_setup_tc offloads to block callbacks
  nfp: bpf: Convert ndo_setup_tc offloads to block callbacks
  dsa: Convert ndo_setup_tc offloads to block callbacks
  net: sched: avoid ndo_setup_tc calls for TC_SETUP_CLS*
  net: sched: remove unused classid field from tc_cls_common_offload
  net: sched: remove unused is_classid_clsact_ingress/egress helpers
  net: sched: introduce block mechanism to handle netif_keep_dst calls
  net: sched: remove classid and q fields from tcf_proto
  net: sched: allow ingress and clsact qdiscs to share filter blocks

 drivers/net/ethernet/broadcom/bnxt/bnxt.c         |  37 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c      |   3 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c     |  41 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  42 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |  45 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |   4 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |  45 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c  |  62 ++-
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c    |  83 +++-
 drivers/net/ethernet/netronome/nfp/bpf/main.c     |  51 +-
 drivers/net/ethernet/netronome/nfp/bpf/offload.c  |   4 +
 .../net/ethernet/netronome/nfp/flower/offload.c   |  54 ++-
 include/linux/netdevice.h                         |   1 +
 include/net/pkt_cls.h                             | 195 +++++++-
 include/net/pkt_sched.h                           |  14 +-
 include/net/sch_generic.h                         |  15 +-
 include/uapi/linux/pkt_sched.h                    |  12 +
 net/core/dev.c                                    |  21 +-
 net/dsa/slave.c                                   |  64 ++-
 net/sched/cls_api.c                               | 527 +++++++++++++++++++--
 net/sched/cls_bpf.c                               |  32 +-
 net/sched/cls_flow.c                              |   9 +-
 net/sched/cls_flower.c                            |  29 +-
 net/sched/cls_fw.c                                |   5 +-
 net/sched/cls_matchall.c                          |  58 +--
 net/sched/cls_route.c                             |   2 +-
 net/sched/cls_tcindex.c                           |   5 +-
 net/sched/cls_u32.c                               |  79 ++-
 net/sched/ematch.c                                |   2 +-
 net/sched/sch_api.c                               |   6 +-
 net/sched/sch_atm.c                               |   4 +-
 net/sched/sch_cbq.c                               |   2 +-
 net/sched/sch_drr.c                               |   2 +-
 net/sched/sch_dsmark.c                            |   2 +-
 net/sched/sch_fq_codel.c                          |   2 +-
 net/sched/sch_hfsc.c                              |   4 +-
 net/sched/sch_htb.c                               |   4 +-
 net/sched/sch_ingress.c                           | 123 ++++-
 net/sched/sch_multiq.c                            |   2 +-
 net/sched/sch_prio.c                              |   2 +-
 net/sched/sch_qfq.c                               |   2 +-
 net/sched/sch_sfb.c                               |   2 +-
 net/sched/sch_sfq.c                               |   2 +-
 43 files changed, 1370 insertions(+), 330 deletions(-)

--
2.9.5