Commit Graph

46052 Commits

Author SHA1 Message Date
Tom Herbert
c6cc1ca7f4 flowi: Abstract out functions to get flow hash based on flowi
Create __get_hash_from_flowi6 and __get_hash_from_flowi4 to get the
flow keys and hash based on flowi structures. These are called by
__skb_get_hash_flowi6 and __skb_get_hash_flowi4. Also, created
get_hash_from_flowi6 and get_hash_from_flowi4 which can be called
when just the hash value for a flowi is needed.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01 15:06:22 -07:00
Tom Herbert
bcc83839ff skbuff: Make __skb_set_sw_hash a general function
Move __skb_set_sw_hash to skbuff.h and add __skb_set_hash which is
a common method (between __skb_set_sw_hash and skb_set_hash) to set
the hash in an skbuff.

Also, move skb_clear_hash to be closer to __skb_set_hash.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01 15:06:22 -07:00
Tom Herbert
e5276937ae flow_dissector: Move skb related functions to skbuff.h
Move the flow dissector functions that are specific to skbuffs into
skbuff.h out of flow_dissector.h. This makes flow_dissector.h have
no dependencies on skbuff.h.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01 15:06:21 -07:00
Andrew Lunn
a5597008db phy: fixed_phy: Add gpio to determine link up/down.
An SFP module may have a link up/down status pin which can be
connection to a GPIO line of the host. Add support for reading such an
GPIO in the fixed_phy driver.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-31 14:48:02 -07:00
Florian Fainelli
5a11dd7d96 net: phy: Allow PHY devices to identify themselves as Ethernet switches, etc.
Some Ethernet MAC drivers using the PHY library require the hardcoding
of link parameters when interfaced to a switch device, SFP module,
switch to switch port, etc. This has typically lead to various ad-hoc
implementations looking like this:

- using a "fixed PHY" emulated device, which will provide link
  indication towards the Ethernet MAC driver and hardware

- pretend there is no PHY and hardcode link parameters, ala mv643x_eth

Based on that, it is desireable to have the PHY drivers advertise the
correct link parameters, just like regular Ethernet PHYs towards their
CPU Ethernet MAC drivers, however, Ethernet MAC drivers should be able
to tell whether this link should be monitored or not. In the context
of an Ethernet switch, SFP module, switch to switch link, we do not
need to monitor this link since it should be always up.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-31 14:48:01 -07:00
Alexei Starovoitov
1a6877b9c0 lib: introduce strncpy_from_unsafe()
generalize FETCH_FUNC_NAME(memory, string) into
strncpy_from_unsafe() and fix sparse warnings that were
present in original implementation.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 16:27:27 -07:00
Philip Downey
df2cf4a78e IGMP: Inhibit reports for local multicast groups
The range of addresses between 224.0.0.0 and 224.0.0.255 inclusive, is
reserved for the use of routing protocols and other low-level topology
discovery or maintenance protocols, such as gateway discovery and
group membership reporting.  Multicast routers should not forward any
multicast datagram with destination addresses in this range,
regardless of its TTL.

Currently, IGMP reports are generated for this reserved range of
addresses even though a router will ignore this information since it
has no purpose.  However, the presence of reserved group addresses in
an IGMP membership report uses up network bandwidth and can also
obscure addresses of interest when inspecting membership reports using
packet inspection or debug messages.

Although the RFCs for the various version of IGMP (e.g.RFC 3376 for
v3) do not specify that the reserved addresses be excluded from
membership reports, it should do no harm in doing so.  In particular
there should be no adverse effect in any IGMP snooping functionality
since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
Snooping Switches Considerations) section 2.1.2. Data Forwarding
Rules:

    2) Packets with a destination IP (DIP) address in the 224.0.0.X
       range which are not IGMP must be forwarded on all ports.

IGMP reports for local multicast groups can now be optionally
inhibited by means of a system control variable (by setting the value
to zero) e.g.:
    echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports

To retain backwards compatibility the previous behaviour is retained
by default on system boot or reverted by setting the value back to
non-zero e.g.:
    echo 1 >  /proc/sys/net/ipv4/igmp_link_local_mcast_reports

Signed-off-by: Philip Downey <pdowney@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 13:28:47 -07:00
David S. Miller
0d36938bb8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-27 21:45:31 -07:00
Joe Stringer
2e4cfae2a8 netfilter: Define v6ops in !CONFIG_NETFILTER case.
When CONFIG_OPENVSWITCH is set, and CONFIG_NETFILTER is not set, the
openvswitch IPv6 fragmentation handling cannot refer to ipv6_ops because
it isn't defined. Add a dummy version to avoid #ifdefs in source files.

Fixes: 7f8a436 "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:35:51 -07:00
Jiri Pirko
0dc1549bfd net: kill long time unused bonding private flags
We don't use them for years, just kill them now.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:28:35 -07:00
Jiri Pirko
35d4e17252 net: add netif_is_ovs_master helper with IFF_OPENVSWITCH private flag
Add this helper so code can easily figure out if netdev is openswitch.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:28:35 -07:00
Jiri Pirko
0894ae3f0a net: add netif_is_bridge_master helper
Add this helper so code can easily figure out if netdev is a bridge.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:28:34 -07:00
Jiri Pirko
0e4ead9d7b net: introduce change upper device notifier change info
Add info that is passed along with NETDEV_CHANGEUPPER event.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-27 16:28:34 -07:00
Guilherme G. Piccoli
22b6839b91 PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI") changed the location of the code that initialises
dev->msi_cap/msix_cap and then disables MSI/MSI-X interrupts at PCI
probe time in devices that have this flag set. It moved the code from
pci_msi_init_pci_dev() to a new function named pci_msi_setup_pci_dev(),
called by pci_setup_device().

The pseries PCI probing code does not call pci_setup_device(), so since
the aforementioned commit the function pci_msi_setup_pci_dev() is not
called and MSI/MSI-X interrupts are left enabled. Additionally because
dev->msi_cap/msix_cap are not initialised no driver can ever enable
MSI/MSI-X.

To fix this, the pseries PCI probe should manually call
pci_msi_setup_pci_dev(), so this patch makes it non-static.

Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
[mpe: Update change log to mention dev->msi_cap/msix_cap]
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-26 21:40:49 +10:00
David S. Miller
d9893d1351 Merge tag 'nfc-next-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next
Samuel Ortiz says:

====================
NFC 4.3 pull request

This is the NFC pull request for 4.3.
With this one we have:

- A new driver for Samsung's S3FWRN5 NFC chipset. In order to
  properly support this driver, a few NCI core routines needed
  to be exported. Future drivers like Intel's Fields Peak will
  benefit from this.

- SPI support as a physical transport for STM st21nfcb.

- An additional netlink API for sending replies back to userspace
  from vendor commands.

- 2 small fixes for TI's trf7970a

- A few st-nci fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 20:42:57 -07:00
Tom Herbert
b7fe10e5eb gro: Fix remcsum offload to deal with frags in GRO
The remote checksum offload GRO did not consider the case that frag0
might be in use. This patch fixes that by accessing headers using the
skb_gro functions and not saving offsets relative to skb->head.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-23 15:59:56 -07:00
Linus Torvalds
84f3fe4608 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A series of small fixlets for a regression visible on OMAP devices
  caused by the conversion of the OMAP interrupt chips to hierarchical
  interrupt domains.  Mostly one liners on the driver side plus a small
  helper function in the core to avoid open coded mess in the drivers"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/crossbar: Restore set_wake functionality
  irqchip/crossbar: Restore the mask on suspend behaviour
  ARM: OMAP: wakeupgen: Restore the irq_set_type() mechanism
  irqchip/crossbar: Restore the irq_set_type() mechanism
  genirq: Introduce irq_chip_set_type_parent() helper
  genirq: Don't return ENOSYS in irq_chip_retrigger_hierarchy
2015-08-22 07:45:36 -07:00
Michal Hocko
2f064f3485 mm: make page pfmemalloc check more robust
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone.  Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-21 14:30:10 -07:00
David S. Miller
dc25b25897 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/qmi_wwan.c

Overlapping additions of new device IDs to qmi_wwan.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-21 11:44:04 -07:00
Pablo Neira Ayuso
81bf1c64e7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Resolve conflicts with conntrack template fixes.

Conflicts:
	net/netfilter/nf_conntrack_core.c
	net/netfilter/nf_synproxy_core.c
	net/netfilter/xt_CT.c

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-08-21 06:09:05 +02:00
David S. Miller
ef09242f39 Merge tag 'wireless-drivers-next-for-davem-2015-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:

====================
Major changes:

ath10k:

* add support for qca99x0 family of devices
* improve performance of tx_lock
* add support for raw mode (802.11 frame format) and software crypto
  engine enabled via a module parameter

ath9k:

* add fast-xmit support

wil6210:

* implement TSO support
* support bootloader v1 and onwards

iwlwifi:

* Deprecate -10.ucode
* Clean ups towards multiple Rx queues
* Add support for longer CMD IDs. This will be required by new
  firmwares since we are getting close to the u8 limit.
* bugfixes for the D0i3 power state
* Add basic support for FTM
* polish the Miracast operation
* fix a few power consumption issues
* scan cleanup
* fixes for D0i3 system state
* add paging for devices that support it
* add again the new RBD allocation model
* add more options to the firmware debug system
* add support for frag SKBs in Tx
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 14:13:25 -07:00
Johannes Berg
f4e774f55f average: remove out-of-line implementation
Since all users are now converted to the inline implementation,
remove the out-of-line implementation entirely.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-20 14:10:23 -07:00
Grygorii Strashko
b7560de198 genirq: Introduce irq_chip_set_type_parent() helper
This helper is required for irq chips which do not implement a
irq_set_type callback and need to call down the irq domain hierarchy
for the actual trigger type change.

This helper is required to fix further wreckage caused by the
conversion of TI OMAP to hierarchical irq domains and therefor tagged
for stable.

[ tglx: Massaged changelog ]

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: <linux@arm.linux.org.uk>
Cc: <nsekhar@ti.com>
Cc: <jason@lakedaemon.net>
Cc: <balbi@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: <tony@atomide.com>
Cc: <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org # 4.1
Link: http://lkml.kernel.org/r/1439554830-19502-3-git-send-email-grygorii.strashko@ti.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-08-20 00:25:25 +02:00
Linus Walleij
74f4e0cc61 bcma: switch GPIO portions to use GPIOLIB_IRQCHIP
This switches the BCMA GPIO driver to use GPIOLIB_IRQCHIP to
handle its interrupts instead of rolling its own copy of the
irqdomain handling etc.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2015-08-18 09:08:47 +03:00
Kalle Valo
15f6d96ded Merge tag 'mac80211-next-for-davem-2015-08-14' mac80211-next.git
iwlwifi needs new mac80211 patches so merge mac80211-next.git to
wireless-drivers-next.git.
2015-08-18 08:44:22 +03:00
Linus Torvalds
4e7fca0d0a Merge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "Three minor device-specific fixes and revert of NCQ autosense added
  during this -rc1.

  It turned out that NCQ autosense as currently implemented interferes
  with the usual error handling behavior.  It will be revisited in the
  near future"

* 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ata: ahci_brcmstb: Fix misuse of IS_ENABLED
  sata_sx4: Check return code from pdc20621_i2c_read()
  Revert "libata: Implement NCQ autosense"
  Revert "libata: Implement support for sense data reporting"
  Revert "libata-eh: Set 'information' field for autosense"
  ata: ahci_brcmstb: Fix warnings with CONFIG_PM_SLEEP=n
2015-08-17 16:20:45 -07:00
David Ahern
808d28c468 net: Fix docbook warning for IFF_VRF_MASTER enum
kbuild test robot reported:
tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   d52736e24f
commit: 4e3c89920c [751/762] net: Introduce VRF related flags and helpers
reproduce: make htmldocs

>> Warning(include/linux/netdevice.h:1293): Enum value 'IFF_VRF_MASTER' not described in enum 'netdev_priv_flags'

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:56:35 -07:00
David Ahern
2f52bdcf6b net: Updates to netif_index_is_vrf
As Eric noted netif_index_is_vrf is not called with rcu_read_lock held,
so wrap the dev_get_by_index_rcu in rcu_read_lock and unlock.

If VRF is not enabled or oif is 0 skip the device lookup. In both cases
index cannot be the VRF master.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:55:15 -07:00
Achiad Shochat
3c2d18ef22 net/mlx5e: Support ethtool get/set_pauseparam
Only rx/tx pause settings.
Autoneg setting is currently not supported.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:36 -07:00
Achiad Shochat
6fa1bcab6b net/mlx5e: Ethtool link speed setting fixes
- Port speed settings are applied by the device only upon
  port admin status transition from DOWN to UP.
  So we enforce this transition regardless of the port's
  current operation state (which may be occasionally DOWN if
  for example the network cable is disconnected).
- Fix the PORT_UP/DOWN device interface enum
- Set the local_port bit in the device PAOS register
- EXPORT the PAOS (Port Administrative and Operational Status)
  register set/query access functions.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 15:51:35 -07:00
David S. Miller
2bd736fa0d Merge tag 'mac80211-next-for-davem-2015-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:

====================
Another pull request for the next cycle, this time with quite
a bit of content:
 * mesh fixes/improvements from Alexis, Bob, Chun-Yeow and Jesse
 * TDLS higher bandwidth support (Arik)
 * OCB fixes from Bertold Van den Bergh
 * suspend/resume fixes from Eliad
 * dynamic SMPS support for minstrel-HT (Krishna Chaitanya)
 * VHT bitrate mask support (Lorenzo Bianconi)
 * better regulatory support for 5/10 MHz channels (Matthias May)
 * basic support for MU-MIMO to avoid the multi-vif issue (Sara Sharon)
along with a number of other cleanups.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 14:25:04 -07:00
Jesse Brandeburg
fbaff3ef85 net: fix endian check warning in etherdevice.h
Sparse builds have been warning for a really long time now
that etherdevice.h has a conversion that is unsafe.

  include/linux/etherdevice.h:79:32: warning: restricted __be16 degrades to integer

This code change fixes the issue and generates the exact
same assembly before/after (checked on x86_64)

Fixes: 2c722fe1c8 (etherdevice: Optimize a few is_<foo>_ether_addr functions)
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 12:14:53 -07:00
Phil Sutter
fa8187c964 net: declare new net_device priv_flag IFF_NO_QUEUE
This private net_device flag can be set by drivers to inform that a
device runs fine without a qdisc attached. This was formerly done by
setting tx_queue_len to zero.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-17 11:50:18 -07:00
Christophe Ricard
76b733d158 nfc: st-nci: Remove duplicate file platform_data/st_nci.h
commit "nfc: st-nci: Rename st21nfcb to st-nci" adds
include/linux/platform_data/st_nci.h duplicated with
include/linux/platform_data/st-nci.h.

Only drivers/nfc/st-nci/i2c.c uses platform_data/st_nci.h.

Cc: stable@vger.kernel.org
Reported-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2015-08-17 00:35:06 +02:00
Emmanuel Grumbach
8f9c98df94 mac80211: fix BIT position for TDLS WIDE extended cap
The bit was not according to ieee80211 specification.
Fix that.

Reviewed-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-08-14 17:49:53 +02:00
Johannes Berg
2377799c08 average: provide macro to create static EWMA
Having the EWMA parameters stored in the runtime struct imposes
memory requirements for the constant values that could just be
inlined in the code. This particularly makes sense if there are
a lot of such structs, for example in mac80211 in the station
table where each station has a number of these in an array, and
there can be many stations.

Provide a macro DECLARE_EWMA() that declares the necessary struct
and inline functions to access it with the parameters hard-coded;
using this also means the user no longer needs to 'select AVERAGE'
as it's entirely self-contained.

In the mac80211 case, on x86-64, this actually slightly *reduces*
code size, while also saving 80 bytes of runtime memory per sta.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-08-14 17:49:52 +02:00
David Ahern
4e3c89920c net: Introduce VRF related flags and helpers
Add a VRF_MASTER flag for interfaces and helper functions for determining
if a device is a VRF_MASTER.

Add link attribute for passing VRF_TABLE id.

Add vrf_ptr to netdevice.

Add various macros for determining if a device is a VRF device, the index
of the master VRF device and table associated with VRF device.

Signed-off-by: Shrijeet Mukherjee <shm@cumulusnetworks.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 22:43:20 -07:00
Andy Gospodarek
35103d1117 net: ipv6 sysctl option to ignore routes when nexthop link is down
Like the ipv4 patch with a similar title, this adds a sysctl to allow
the user to change routing behavior based on whether or not the
interface associated with the nexthop was an up or down link.  The
default setting preserves the current behavior, but anyone that enables
it will notice that nexthops on down interfaces will no longer be
selected:

net.ipv6.conf.all.ignore_routes_with_linkdown = 0
net.ipv6.conf.default.ignore_routes_with_linkdown = 0
net.ipv6.conf.lo.ignore_routes_with_linkdown = 0
...

When the above sysctls are set, not only will link status be reported to
userspace, but an indication that a nexthop is dead and will not be used
is also reported.

1000::/8 via 7000::2 dev p7p1  metric 1024 dead linkdown  pref medium
1000::/8 via 8000::2 dev p8p1  metric 1024  pref medium
7000::/8 dev p7p1  proto kernel  metric 256 dead linkdown  pref medium
8000::/8 dev p8p1  proto kernel  metric 256  pref medium
9000::/8 via 8000::2 dev p8p1  metric 2048  pref medium
9000::/8 via 7000::2 dev p7p1  metric 1024 dead linkdown  pref medium
fe80::/64 dev p7p1  proto kernel  metric 256 dead linkdown  pref medium
fe80::/64 dev p8p1  proto kernel  metric 256  pref medium

This also adds devconf support and notification when sysctl values
change.

v2: drop use of rt6i_nhflags since it is not needed right now

Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Dinesh Dutt <ddutt@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 21:27:19 -07:00
Jeremy Linton
4c96b7dc0d Add a matching set of device_ functions for determining mac/phy
OF has some helper functions for parsing MAC and PHY settings.
In cases where the platform is providing this information rather
than the device itself, there needs to be similar functions for ACPI.

These functions are slightly modified versions of the ones in
of_net which can use information provided via DT or ACPI.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:58:29 -07:00
David S. Miller
182ad468e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/cavium/Kconfig

The cavium conflict was overlapping dependency
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-13 16:23:11 -07:00
Linus Torvalds
26b552e0a8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
    devices, from Emmanuel Grumbach.

 2) Falling back to vmalloc in conntrack should not emit a warning, from
    Pablo Neira Ayuso.

 3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
    Felipe Dominguez Vega.

 4) Rocker doesn't free netdev on device removal, from Ido Schimmel.

 5) UDP multicast early sock demux has route handling races, from Eric
    Dumazet.

 6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.

 7) Fix use-after-free in skb_set_peeked, from Herbert Xu.

 8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
    to fraglists longer than the driver can support.  From Jason Wang.

 9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.

10) Fix interrupt storm in bna driver, from Ivan Vecera.

11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.

12) Fix inet request sock leak, from Eric Dumazet.

13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
    driver, from LEROY Christophe.

14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
    to get fixed any time soon.  From Jakub Kicinski

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  cosa: missing error code on failure in probe()
  gianfar: remove faulty filer optimizer
  gianfar: correct list membership accounting
  gianfar: correct filer table writing
  bonding: Gratuitous ARP gets dropped when first slave added
  net: dsa: Do not override PHY interface if already configured
  net: fs_enet: mask interrupts for TX partial frames.
  net: fs_enet: explicitly remove I flag on TX partial frames
  inet: fix possible request socket leak
  inet: fix races with reqsk timers
  mkiss: Fix error handling in mkiss_open()
  bnx2x: Free NVRAM lock at end of each page
  bnx2x: Prevent null pointer dereference on SKB release
  cxgb4: missing curly braces in t4_setup_debugfs()
  net-timestamp: Update skb_complete_tx_timestamp comment
  ipv6: don't reject link-local nexthop on other interface
  netlink: make sure -EBUSY won't escape from netlink_insert
  bna: fix interrupts storm caused by erroneous packets
  net: mvpp2: replace TX coalescing interrupts with hrtimer
  net: mvpp2: enable proper per-CPU TX buffers unmapping
  ...
2015-08-13 10:46:39 -07:00
Benjamin Poirier
7a76a021cd net-timestamp: Update skb_complete_tx_timestamp comment
After "62bccb8 net-timestamp: Make the clone operation stand-alone from phy
timestamping" the hwtstamps parameter of skb_complete_tx_timestamp() may no
longer be NULL.

Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Cc: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 13:35:37 -07:00
Kaixu Xia
35578d7984 bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter
According to the perf_event_map_fd and index, the function
bpf_perf_event_read() can convert the corresponding map
value to the pointer to struct perf_event and return the
Hardware PMU counter value.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:50:06 -07:00
Kaixu Xia
ea317b267e bpf: Add new bpf map type to store the pointer to struct perf_event
Introduce a new bpf map type 'BPF_MAP_TYPE_PERF_EVENT_ARRAY'.
This map only stores the pointer to struct perf_event. The
user space event FDs from perf_event_open() syscall are converted
to the pointer to struct perf_event and stored in map.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:50:05 -07:00
Wang Nan
2a36f0b92e bpf: Make the bpf_prog_array_map more generic
All the map backends are of generic nature. In order to avoid
adding much special code into the eBPF core, rewrite part of
the bpf_prog_array map code and make it more generic. So the
new perf_event_array map type can reuse most of code with
bpf_prog_array map and add fewer lines of special code.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:50:05 -07:00
Kaixu Xia
ffe8690c85 perf: add the necessary core perf APIs when accessing events counters in eBPF programs
This patch add three core perf APIs:
 - perf_event_attrs(): export the struct perf_event_attr from struct
   perf_event;
 - perf_event_get(): get the struct perf_event from the given fd;
 - perf_event_read_local(): read the events counters active on the
   current CPU;
These APIs are needed when accessing events counters in eBPF programs.

The API perf_event_read_local() comes from Peter and I add the
corresponding SOB.

Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-09 22:50:05 -07:00
Andreas Schultz
3499abb249 netfilter: nfacct: per network namespace support
- Move the nfnl_acct_list into the network namespace, initialize
  and destroy it per namespace
- Keep track of refcnt on nfacct objects, the old logic does not
  longer work with a per namespace list
- Adjust xt_nfacct to pass the namespace when registring objects

Signed-off-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-08-07 11:50:56 +02:00
Jason A. Donenfeld
d92cff89a0 net_dbg_ratelimited: turn into no-op when !DEBUG
The pr_debug family of functions turns into a no-op when -DDEBUG is not
specified, opting instead to call "no_printk", which gets compiled to a
no-op (but retains gcc's nice warnings about printf-style arguments).

The problem with net_dbg_ratelimited is that it is defined to be a
variant of net_ratelimited_function, which expands to essentially:

    if (net_ratelimit())
        pr_debug(fmt, ...);

When DEBUG is not defined, then this becomes,

    if (net_ratelimit())
        ;

This seems benign, except it isn't. Firstly, there's the obvious
overhead of calling net_ratelimit needlessly, which does quite some book
keeping for the rate limiting. Given that the pr_debug and
net_dbg_ratelimited family of functions are sprinkled liberally through
performance critical code, with developers assuming they'll be compiled
out to a no-op most of the time, we certainly do not want this needless
book keeping. Secondly, and most visibly, even though no debug message
is printed when DEBUG is not defined, if there is a flood of
invocations, dmesg winds up peppered with messages such as
"net_ratelimit: 320 callbacks suppressed". This is because our
aforementioned net_ratelimit() function actually prints this text in
some circumstances. It's especially odd to see this when there isn't any
other accompanying debug message.

So, in sum, it doesn't make sense to have this function's current
behavior, and instead it should match what every other debug family of
functions in the kernel does with !DEBUG -- nothing.

This patch replaces calls to net_dbg_ratelimited when !DEBUG with
no_printk, keeping with the idiom of all the other debug print helpers.

Also, though not strictly neccessary, it guards the call with an if (0)
so that all evaluation of any arguments are sure to be compiled out.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 23:51:30 -07:00
Gal Pressman
efea389d3c net/mlx5_core: Support physical port counters
Added physical port counters in the following standard formats to
ethtool statistics:
  - IEEE 802.3
  - RFC2863
  - RFC2819

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:59 -07:00
Achiad Shochat
5c50368f38 net/mlx5e: Light-weight netdev open/stop
Create/destroy TIRs, TISs and flow tables upon PCI probe/remove rather
than upon the netdev ndo_open/stop.

Upon ndo_stop(), redirect all RX traffic to the (lately introduced)
"Drop RQ" and then close only the RX/TX rings, leaving the TIRs,
TISs and flow tables alive.

Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-06 22:00:58 -07:00