commit f994ec5a5d8094a195c23b08d373b055ef0044aa Author: Greg Kroah-Hartman Date: Mon Apr 14 06:48:24 2014 -0700 Linux 3.13.10 commit bb59d7ebdf66319846943886e0d0d0ea05a2f9f6 Author: Ard Biesheuvel Date: Thu Mar 27 18:14:40 2014 +0100 crypto: ghash-clmulni-intel - use C implementation for setkey() commit 8ceee72808d1ae3fb191284afc2257a2be964725 upstream. The GHASH setkey() function uses SSE registers but fails to call kernel_fpu_begin()/kernel_fpu_end(). Instead of adding these calls, and then having to deal with the restriction that they cannot be called from interrupt context, move the setkey() implementation to the C domain. Note that setkey() does not use any particular SSE features and is not expected to become a performance bottleneck. Signed-off-by: Ard Biesheuvel Acked-by: H. Peter Anvin Fixes: 0e1227d356e9b (crypto: ghash - Add PCLMULQDQ accelerated implementation) Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit a055a7bb7316773d8fb916cb990e238eb7ed67f5 Author: Finn Thain Date: Thu Mar 6 10:29:27 2014 +1100 m68k: Skip futex_atomic_cmpxchg_inatomic() test commit e571c58f313d35c56e0018470e3375ddd1fd320e upstream. Skip the futex_atomic_cmpxchg_inatomic() test in futex_init(). It causes a fatal exception on 68030 (and presumably 68020 also). Signed-off-by: Finn Thain Acked-by: Geert Uytterhoeven Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1403061006440.5525@nippy.intranet Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman commit 12ce0d10f491b4a377333398d5938aaf19e0d647 Author: Heiko Carstens Date: Sun Mar 2 13:09:47 2014 +0100 futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test commit 03b8c7b623c80af264c4c8d6111e5c6289933666 upstream. If an architecture has futex_atomic_cmpxchg_inatomic() implemented and there is no runtime check necessary, allow to skip the test within futex_init(). This allows to get rid of some code which would always give the same result, and also allows the compiler to optimize a couple of if statements away. Signed-off-by: Heiko Carstens Cc: Finn Thain Cc: Geert Uytterhoeven Link: http://lkml.kernel.org/r/20140302120947.GA3641@osiris Signed-off-by: Thomas Gleixner [geert: Backported to v3.10..v3.13] Signed-off-by: Geert Uytterhoeven Signed-off-by: Greg Kroah-Hartman commit e75c66297379f26c0a2c112b9c8d302ef0dd0e96 Author: Vineet Gupta Date: Sat Apr 5 15:30:22 2014 +0530 ARC: [nsimosci] Unbork console commit 61fb4bfc010b0d2940f7fd87acbce6a0f03217cb upstream. Despite the switch to right UART driver (prev patch), serial console still doesn't work due to missing CONFIG_SERIAL_OF_PLATFORM Also fix the default cmdline in DT to not refer to out-of-tree ARC framebuffer driver for console. Signed-off-by: Vineet Gupta Cc: Francois Bedard Signed-off-by: Greg Kroah-Hartman commit 08a7b3c3749c9696ae85ea0d89c10a44dd44552b Author: Mischa Jonker Date: Thu May 16 19:36:08 2013 +0200 ARC: [nsimosci] Change .dts to use generic 8250 UART commit 6eda477b3c54b8236868c8784e5e042ff14244f0 upstream. The Synopsys APB DW UART has a couple of special features that are not in the System C model. In 3.8, the 8250_dw driver didn't really use these features, but from 3.9 onwards, the 8250_dw driver has become incompatible with our model. Signed-off-by: Mischa Jonker Signed-off-by: Vineet Gupta Cc: Francois Bedard Signed-off-by: Greg Kroah-Hartman commit b2c4b9547b22398498fd890c7f81db67ed0d8554 Author: Mikulas Patocka Date: Wed Dec 11 19:39:19 2013 -0500 powernow-k6: reorder frequencies commit 22c73795b101597051924556dce019385a1e2fa0 upstream. This patch reorders reported frequencies from the highest to the lowest, just like in other frequency drivers. Signed-off-by: Mikulas Patocka Acked-by: Viresh Kumar Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 2420afb95ad6577f961dc64c9463ee8abe0c6f4b Author: Mikulas Patocka Date: Wed Dec 11 19:38:53 2013 -0500 powernow-k6: correctly initialize default parameters commit d82b922a4acc1781d368aceac2f9da43b038cab2 upstream. The powernow-k6 driver used to read the initial multiplier from the powernow register. However, there is a problem with this: * If there was a frequency transition before, the multiplier read from the register corresponds to the current multiplier. * If there was no frequency transition since reset, the field in the register always reads as zero, regardless of the current multiplier that is set using switches on the mainboard and that the CPU is running at. The zero value corresponds to multiplier 4.5, so as a consequence, the powernow-k6 driver always assumes multiplier 4.5. For example, if we have 550MHz CPU with bus frequency 100MHz and multiplier 5.5, the powernow-k6 driver thinks that the multiplier is 4.5 and bus frequency is 122MHz. The powernow-k6 driver then sets the multiplier to 4.5, underclocking the CPU to 450MHz, but reports the current frequency as 550MHz. There is no reliable way how to read the initial multiplier. I modified the driver so that it contains a table of known frequencies (based on parameters of existing CPUs and some common overclocking schemes) and sets the multiplier according to the frequency. If the frequency is unknown (because of unusual overclocking or underclocking), the user must supply the bus speed and maximum multiplier as module parameters. This patch should be backported to all stable kernels. If it doesn't apply cleanly, change it, or ask me to change it. Signed-off-by: Mikulas Patocka Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit d6d2e9388f12580deb78a9055f26c1e224fd3761 Author: Mikulas Patocka Date: Wed Dec 11 19:38:32 2013 -0500 powernow-k6: disable cache when changing frequency commit e20e1d0ac02308e2211306fc67abcd0b2668fb8b upstream. I found out that a system with k6-3+ processor is unstable during network server load. The system locks up or the network card stops receiving. The reason for the instability is the CPU frequency scaling. During frequency transition the processor is in "EPM Stop Grant" state. The documentation says that the processor doesn't respond to inquiry requests in this state. Consequently, coherency of processor caches and bus master devices is not maintained, causing the system instability. This patch flushes the cache during frequency transition. It fixes the instability. Other minor changes: * u64 invalue changed to unsigned long because the variable is 32-bit * move the logic to set the multiplier to a separate function powernow_k6_set_cpu_multiplier * preserve lower 5 bits of the powernow port instead of 4 (the voltage field has 5 bits) * mask interrupts when reading the multiplier, so that the port is not open during other activity (running other kernel code with the port open shouldn't cause any misbehavior, but we should better be safe and keep the port closed) This patch should be backported to all stable kernels. If it doesn't apply cleanly, change it, or ask me to change it. Signed-off-by: Mikulas Patocka Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 5986352314c939fdf1108ae32c6e4a799029c92b Author: Sasha Levin Date: Sat Mar 29 20:39:35 2014 -0400 rds: prevent dereference of a NULL device in rds_iw_laddr_check [ Upstream commit bf39b4247b8799935ea91d90db250ab608a58e50 ] Binding might result in a NULL device which is later dereferenced without checking. Signed-off-by: Sasha Levin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit deb2eaa1b2f3a6f7fa0b3fa2c45751bd3b99d163 Author: Dan Carpenter Date: Tue Apr 8 12:23:09 2014 +0300 isdnloop: several buffer overflows [ Upstream commit 7563487cbf865284dcd35e9ef5a95380da046737 ] There are three buffer overflows addressed in this patch. 1) In isdnloop_fake_err() we add an 'E' to a 60 character string and then copy it into a 60 character buffer. I have made the destination buffer 64 characters and I'm changed the sprintf() to a snprintf(). 2) In isdnloop_parse_cmd(), p points to a 6 characters into a 60 character buffer so we have 54 characters. The ->eazlist[] is 11 characters long. I have modified the code to return if the source buffer is too long. 3) In isdnloop_command() the cbuf[] array was 60 characters long but the max length of the string then can be up to 79 characters. I made the cbuf array 80 characters long and changed the sprintf() to snprintf(). I also removed the temporary "dial" buffer and changed it to use "p" directly. Unfortunately, we pass the "cbuf" string from isdnloop_command() to isdnloop_writecmd() which truncates anything over 60 characters to make it fit in card->omsg[]. (It can accept values up to 255 characters so long as there is a '\n' character every 60 characters). For now I have just fixed the memory corruption bug and left the other problems in this driver alone. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4fbd2c2ef76cd24293a2ca8df0b4ae764db7b8d0 Author: YOSHIFUJI Hideaki Date: Wed Apr 2 12:48:42 2014 +0900 isdnloop: Validate NUL-terminated strings from user. [ Upstream commit 77bc6bed7121936bb2e019a8c336075f4c8eef62 ] Return -EINVAL unless all of user-given strings are correctly NUL-terminated. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c95176f32b88f95a0d6592ded5387d4cf076d83b Author: Mike Rapoport Date: Tue Apr 1 09:23:01 2014 +0300 net: vxlan: fix crash when interface is created with no group [ Upstream commit 5933a7bbb5de66482ea8aa874a7ebaf8e67603c4 ] If the vxlan interface is created without explicit group definition, there are corner cases which may cause kernel panic. For instance, in the following scenario: node A: $ ip link add dev vxlan42 address 2c:c2:60:00:10:20 type vxlan id 42 $ ip addr add dev vxlan42 10.0.0.1/24 $ ip link set up dev vxlan42 $ arp -i vxlan42 -s 10.0.0.2 2c:c2:60:00:01:02 $ bridge fdb add dev vxlan42 to 2c:c2:60:00:01:02 dst $ ping 10.0.0.2 node B: $ ip link add dev vxlan42 address 2c:c2:60:00:01:02 type vxlan id 42 $ ip addr add dev vxlan42 10.0.0.2/24 $ ip link set up dev vxlan42 $ arp -i vxlan42 -s 10.0.0.1 2c:c2:60:00:10:20 node B crashes: vxlan42: 2c:c2:60:00:10:20 migrated from 4011:eca4:c0a8:6466:c0a8:6415:8e09:2118 to (invalid address) vxlan42: 2c:c2:60:00:10:20 migrated from 4011:eca4:c0a8:6466:c0a8:6415:8e09:2118 to (invalid address) BUG: unable to handle kernel NULL pointer dereference at 0000000000000046 IP: [] ip6_route_output+0x58/0x82 PGD 7bd89067 PUD 7bd4e067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.14.0-rc8-hvx-xen-00019-g97a5221-dirty #154 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff88007c774f50 ti: ffff88007c79c000 task.ti: ffff88007c79c000 RIP: 0010:[] [] ip6_route_output+0x58/0x82 RSP: 0018:ffff88007fd03668 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffffffff8186a000 RCX: 0000000000000040 RDX: 0000000000000000 RSI: ffff88007b0e4a80 RDI: ffff88007fd03754 RBP: ffff88007fd03688 R08: ffff88007b0e4a80 R09: 0000000000000000 R10: 0200000a0100000a R11: 0001002200000000 R12: ffff88007fd03740 R13: ffff88007b0e4a80 R14: ffff88007b0e4a80 R15: ffff88007bba0c50 FS: 0000000000000000(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000046 CR3: 000000007bb60000 CR4: 00000000000006e0 Stack: 0000000000000000 ffff88007fd037a0 ffffffff8186a000 ffff88007fd03740 ffff88007fd036c8 ffffffff814320bb 0000000000006e49 ffff88007b8b7360 ffff88007bdbf200 ffff88007bcbc000 ffff88007b8b7000 ffff88007b8b7360 Call Trace: [] ip6_dst_lookup_tail+0x2d/0xa4 [] ip6_dst_lookup+0x10/0x12 [] vxlan_xmit_one+0x32a/0x68c [] ? _raw_spin_unlock_irqrestore+0x12/0x14 [] ? lock_timer_base.isra.23+0x26/0x4b [] vxlan_xmit+0x66a/0x6a8 [] ? ipt_do_table+0x35f/0x37e [] ? selinux_ip_postroute+0x41/0x26e [] dev_hard_start_xmit+0x2ce/0x3ce [] __dev_queue_xmit+0x2d0/0x392 [] ? eth_header+0x28/0xb5 [] dev_queue_xmit+0xb/0xd [] neigh_resolve_output+0x134/0x152 [] ip_finish_output2+0x236/0x299 [] ip_finish_output+0x98/0x9d [] ip_output+0x62/0x67 [] dst_output+0xf/0x11 [] ip_local_out+0x1b/0x1f [] ip_send_skb+0x11/0x37 [] ip_push_pending_frames+0x2f/0x33 [] icmp_push_reply+0x106/0x115 [] icmp_reply+0x142/0x164 [] icmp_echo.part.16+0x46/0x48 [] ? nf_iterate+0x43/0x80 [] ? xfrm4_policy_check.constprop.11+0x52/0x52 [] icmp_echo+0x25/0x27 [] icmp_rcv+0x1d2/0x20a [] ? xfrm4_policy_check.constprop.11+0x52/0x52 [] ip_local_deliver_finish+0xd6/0x14f [] ? xfrm4_policy_check.constprop.11+0x52/0x52 [] NF_HOOK.constprop.10+0x4c/0x53 [] ip_local_deliver+0x4a/0x4f [] ip_rcv_finish+0x253/0x26a [] ? inet_add_protocol+0x3e/0x3e [] NF_HOOK.constprop.10+0x4c/0x53 [] ip_rcv+0x2a6/0x2ec [] __netif_receive_skb_core+0x43e/0x478 [] ? virtqueue_poll+0x16/0x27 [] __netif_receive_skb+0x55/0x5a [] process_backlog+0x76/0x12f [] net_rx_action+0xa2/0x1ab [] __do_softirq+0xca/0x1d1 [] irq_exit+0x3e/0x85 [] do_IRQ+0xa9/0xc4 [] common_interrupt+0x6d/0x6d [] ? native_safe_halt+0x6/0x8 [] default_idle+0x9/0xd [] arch_cpu_idle+0x13/0x1c [] cpu_startup_entry+0xbc/0x137 [] start_secondary+0x1a0/0x1a5 Code: 24 14 e8 f1 e5 01 00 31 d2 a8 32 0f 95 c2 49 8b 44 24 2c 49 0b 44 24 24 74 05 83 ca 04 eb 1c 4d 85 ed 74 17 49 8b 85 a8 02 00 00 <66> 8b 40 46 66 c1 e8 07 83 e0 07 c1 e0 03 09 c2 4c 89 e6 48 89 RIP [] ip6_route_output+0x58/0x82 RSP CR2: 0000000000000046 ---[ end trace 4612329caab37efd ]--- When vxlan interface is created without explicit group definition, the default_dst protocol family is initialiazed to AF_UNSPEC and the driver assumes IPv4 configuration. On the other side, the default_dst protocol family is used to differentiate between IPv4 and IPv6 cases and, since, AF_UNSPEC != AF_INET, the processing takes the IPv6 path. Making the IPv4 assumption explicit by settting default_dst protocol family to AF_INET4 and preventing mixing of IPv4 and IPv6 addresses in snooped fdb entries fixes the corner case crashes. Signed-off-by: Mike Rapoport Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 15c3d765f97e2d563182184f6446ba6afbeadfbf Author: Daniel Pieczko Date: Tue Apr 1 13:10:34 2014 +0100 Call efx_set_channels() before efx->type->dimension_resources() [ Upstream commit 52ad762b85ed7947ec9eff6b036eb985352f6874 ] When using the "separate_tx_channels=1" module parameter, the TX queues are initially numbered starting from the first TX-only channel number (after all the RX-only channels). efx_set_channels() renumbers the queues so that they are indexed from zero. On EF10, the TX queues need to be relabelled in this way before calling the dimension_resources NIC type operation, otherwise the TX queue PIO buffers can be linked to the wrong VIs when using "separate_tx_channels=1". Added comments to explain UC/WC mappings for PIO buffers Signed-off-by: Shradha Shah Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0e4a1bc23a2243f3f6700b777b6f4dac52090437 Author: Wei Liu Date: Tue Apr 1 12:46:12 2014 +0100 xen-netback: disable rogue vif in kthread context [ Upstream commit e9d8b2c2968499c1f96563e6522c56958d5a1d0d ] When netback discovers frontend is sending malformed packet it will disables the interface which serves that frontend. However disabling a network interface involving taking a mutex which cannot be done in softirq context, so we need to defer this process to kthread context. This patch does the following: 1. introduce a flag to indicate the interface is disabled. 2. check that flag in TX path, don't do any work if it's true. 3. check that flag in RX path, turn off that interface if it's true. The reason to disable it in RX path is because RX uses kthread. After this change the behavior of netback is still consistent -- it won't do any TX work for a rogue frontend, and the interface will be eventually turned off. Also change a "continue" to "break" after xenvif_fatal_tx_err, as it doesn't make sense to continue processing packets if frontend is rogue. This is a fix for XSA-90. Reported-by: Török Edwin Signed-off-by: Wei Liu Cc: Ian Campbell Reviewed-by: David Vrabel Acked-by: Ian Campbell Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5c4f2fdaf68b2f9ec8f445a0c5a138ee6c4f6cd9 Author: Pablo Neira Date: Tue Apr 1 19:38:44 2014 +0200 netlink: don't compare the nul-termination in nla_strcmp [ Upstream commit 8b7b932434f5eee495b91a2804f5b64ebb2bc835 ] nla_strcmp compares the string length plus one, so it's implicitly including the nul-termination in the comparison. int nla_strcmp(const struct nlattr *nla, const char *str) { int len = strlen(str) + 1; ... d = memcmp(nla_data(nla), str, len); However, if NLA_STRING is used, userspace can send us a string without the nul-termination. This is a problem since the string comparison will not match as the last byte may be not the nul-termination. Fix this by skipping the comparison of the nul-termination if the attribute data is nul-terminated. Suggested by Thomas Graf. Cc: Florian Westphal Cc: Thomas Graf Signed-off-by: Pablo Neira Ayuso Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0f11c2a166af8686745f75170b7c1008c9b42761 Author: Hannes Frederic Sowa Date: Mon Mar 31 20:14:10 2014 +0200 ipv6: some ipv6 statistic counters failed to disable bh [ Upstream commit 43a43b6040165f7b40b5b489fe61a4cb7f8c4980 ] After commit c15b1ccadb323ea ("ipv6: move DAD and addrconf_verify processing to workqueue") some counters are now updated in process context and thus need to disable bh before doing so, otherwise deadlocks can happen on 32-bit archs. Fabio Estevam noticed this while while mounting a NFS volume on an ARM board. As a compensation for missing this I looked after the other *_STATS_BH and found three other calls which need updating: 1) icmp6_send: ip6_fragment -> icmpv6_send -> icmp6_send (error handling) 2) ip6_push_pending_frames: rawv6_sendmsg -> rawv6_push_pending_frames -> ... (only in case of icmp protocol with raw sockets in error handling) 3) ping6_v6_sendmsg (error handling) Fixes: c15b1ccadb323ea ("ipv6: move DAD and addrconf_verify processing to workqueue") Reported-by: Fabio Estevam Tested-by: Fabio Estevam Cc: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cadf431bb527964be519675e560fd61db498fe98 Author: Paul Durrant Date: Fri Mar 28 11:39:05 2014 +0000 xen-netback: remove pointless clause from if statement [ Upstream commit 0576eddf24df716d8570ef8ca11452a9f98eaab2 ] This patch removes a test in start_new_rx_buffer() that checks whether a copy operation is less than MAX_BUFFER_OFFSET in length, since MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and the only caller of start_new_rx_buffer() already limits copy operations to PAGE_SIZE or less. Signed-off-by: Paul Durrant Cc: Ian Campbell Cc: Wei Liu Cc: Sander Eikelenboom Reported-By: Sander Eikelenboom Tested-By: Sander Eikelenboom Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 569e4b6489af7e15a10c543b472356a1724a1491 Author: Eric Dumazet Date: Thu Mar 27 07:19:19 2014 -0700 tcp: fix get_timewait4_sock() delay computation on 64bit [ Upstream commit e2a1d3e47bb904082b758dec9d07edf241c45d05 ] It seems I missed one change in get_timewait4_sock() to compute the remaining time before deletion of IPV4 timewait socket. This could result in wrong output in /proc/net/tcp for tm->when field. Fixes: 96f817fedec4 ("tcp: shrink tcp6_timewait_sock by one cache line") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f0213f63e141fd6f023e263180db1ce91ef3cb55 Author: Michael S. Tsirkin Date: Thu Mar 27 12:53:37 2014 +0200 vhost: validate vhost_get_vq_desc return value [ Upstream commit a39ee449f96a2cd44ce056d8a0a112211a9b1a1f ] vhost fails to validate negative error code from vhost_get_vq_desc causing a crash: we are using -EFAULT which is 0xfffffff2 as vector size, which exceeds the allocated size. The code in question was introduced in commit 8dd014adfea6f173c1ef6378f7e5e7924866c923 vhost-net: mergeable buffers support CVE-2014-0055 Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2c1e023a1882c841575a7647572196bd1586409c Author: Michael S. Tsirkin Date: Thu Mar 27 12:00:26 2014 +0200 vhost: fix total length when packets are too short [ Upstream commit d8316f3991d207fe32881a9ac20241be8fa2bad0 ] When mergeable buffers are disabled, and the incoming packet is too large for the rx buffer, get_rx_bufs returns success. This was intentional in order for make recvmsg truncate the packet and then handle_rx would detect err != sock_len and drop it. Unfortunately we pass the original sock_len to recvmsg - which means we use parts of iov not fully validated. Fix this up by detecting this overrun and doing packet drop immediately. CVE-2014-0077 Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c69b36124135f7e7f2f9f82912480dd5428201d9 Author: Vlad Yasevich Date: Wed Mar 26 11:47:56 2014 -0400 vlan: Set hard_header_len according to available acceleration [ Upstream commit fc0d48b8fb449ca007b2057328abf736cb516168 ] Currently, if the card supports CTAG acceleration we do not account for the vlan header even if we are configuring an 8021AD vlan. This may not be best since we'll do software tagging for 8021AD which will cause data copy on skb head expansion Configure the length based on available hw offload capabilities and vlan protocol. CC: Patrick McHardy Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 72e46f6e69c9ae676c5e0652be0a5d38de568ab5 Author: Oliver Neukum Date: Wed Mar 26 14:32:51 2014 +0100 usbnet: include wait queue head in device structure [ Upstream commit 14a0d635d18d0fb552dcc979d6d25106e6541f2e ] This fixes a race which happens by freeing an object on the stack. Quoting Julius: > The issue is > that it calls usbnet_terminate_urbs() before that, which temporarily > installs a waitqueue in dev->wait in order to be able to wait on the > tasklet to run and finish up some queues. The waiting itself looks > okay, but the access to 'dev->wait' is totally unprotected and can > race arbitrarily. I think in this case usbnet_bh() managed to succeed > it's dev->wait check just before usbnet_terminate_urbs() sets it back > to NULL. The latter then finishes and the waitqueue_t structure on its > stack gets overwritten by other functions halfway through the > wake_up() call in usbnet_bh(). The fix is to just not allocate the data structure on the stack. As dev->wait is abused as a flag it also takes a runtime PM change to fix this bug. Signed-off-by: Oliver Neukum Reported-by: Grant Grundler Tested-by: Grant Grundler Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1cc4545b64bb70290712e12c4e306867ea3357c9 Author: Jason Wang Date: Wed Mar 26 13:03:00 2014 +0800 virtio-net: correct error handling of virtqueue_kick() [ Upstream commit 681daee2443291419c57cccb0671f5f94a839005 ] Current error handling of virtqueue_kick() was wrong in two places: - The skb were freed immediately when virtqueue_kick() fail during xmit. This may lead double free since the skb was not detached from the virtqueue. - try_fill_recv() returns false when virtqueue_kick() fail. This will lead unnecessary rescheduling of refill work. Actually, it's safe to just ignore the kick failure in those two places. So this patch fixes this by partially revert commit 67975901183799af8e93ec60e322f9e2a1940b9b. Fixes 67975901183799af8e93ec60e322f9e2a1940b9b (virtio_net: verify if virtqueue_kick() succeeded). Cc: Heinz Graalfs Cc: Rusty Russell Cc: Michael S. Tsirkin Signed-off-by: Jason Wang Acked-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5ff906bf6e0a6e404aec88f27101904159d24508 Author: Vlad Yasevich Date: Mon Mar 24 17:52:12 2014 -0400 tg3: Do not include vlan acceleration features in vlan_features [ Upstream commit 51dfe7b944998eaeb2b34d314f3a6b16a5fd621b ] Including hardware acceleration features in vlan_features breaks stacked vlans (Q-in-Q) by marking the bottom vlan interface as capable of acceleration. This causes one of the tags to be lost and the packets are sent with a sing vlan header. CC: Nithin Nayak Sujir CC: Michael Chan Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 21ee270ccf2d5a91cabbd40a2ffc0d3ab80fedb9 Author: Pravin B Shelar Date: Sun Mar 23 22:06:36 2014 -0700 ip_tunnel: Fix dst ref-count. [ Upstream commit fbd02dd405d0724a0f25897ed4a6813297c9b96f ] Commit 10ddceb22ba (ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer) removed dst-drop call from ip-tunnel-recv. Following commit reintroduce dst-drop and fix the original bug by checking loopback packet before releasing dst. Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681 CC: Xin Long Signed-off-by: Pravin B Shelar Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e78e7aa146835ad539534fe98e573043a73797f1 Author: Erik Hugne Date: Mon Mar 24 16:56:38 2014 +0100 tipc: fix spinlock recursion bug for failed subscriptions [ Upstream commit a5d0e7c037119484a7006b883618bfa87996cb41 ] If a topology event subscription fails for any reason, such as out of memory, max number reached or because we received an invalid request the correct behavior is to terminate the subscribers connection to the topology server. This is currently broken and produces the following oops: [27.953662] tipc: Subscription rejected, illegal request [27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6 [27.957066] lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1 [27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5 [27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc] [27.961430] ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050 [27.962292] ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a [27.963152] ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520 [27.964023] Call Trace: [27.964292] [] dump_stack+0x45/0x56 [27.964874] [] spin_dump+0x8c/0x91 [27.965420] [] spin_bug+0x21/0x26 [27.965995] [] do_raw_spin_lock+0x116/0x140 [27.966631] [] _raw_spin_lock_bh+0x15/0x20 [27.967256] [] subscr_conn_shutdown_event+0x20/0xa0 [tipc] [27.968051] [] tipc_close_conn+0xa4/0xb0 [tipc] [27.968722] [] tipc_conn_terminate+0x1a/0x30 [tipc] [27.969436] [] subscr_conn_msg_event+0x1f2/0x2f0 [tipc] [27.970209] [] tipc_receive_from_sock+0x90/0xf0 [tipc] [27.970972] [] tipc_recv_work+0x29/0x50 [tipc] [27.971633] [] process_one_work+0x165/0x3e0 [27.972267] [] worker_thread+0x119/0x3a0 [27.972896] [] ? manage_workers.isra.25+0x2a0/0x2a0 [27.973622] [] kthread+0xdf/0x100 [27.974168] [] ? kthread_create_on_node+0x1a0/0x1a0 [27.974893] [] ret_from_fork+0x7c/0xb0 [27.975466] [] ? kthread_create_on_node+0x1a0/0x1a0 The recursion occurs when subscr_terminate tries to grab the subscriber lock, which is already taken by subscr_conn_msg_event. We fix this by checking if the request to establish a new subscription was successful, and if not we initiate termination of the subscriber after we have released the subscriber lock. Signed-off-by: Erik Hugne Reviewed-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ad52e55b70e250f8835efe5757cc49b3cb11388b Author: Li RongQing Date: Fri Mar 21 20:53:57 2014 +0800 netpoll: fix the skb check in pkt_is_ns [ Not applicable upstream commit, the code here has been removed upstream. ] Neighbor Solicitation is ipv6 protocol, so we should check skb->protocol with ETH_P_IPV6 Signed-off-by: Li RongQing Cc: WANG Cong Signed-off-by: Greg Kroah-Hartman commit 5bd0d8ba2538d16961e31c92d62cabaa1d3c9b58 Author: Nishanth Menon Date: Fri Mar 21 01:52:48 2014 -0500 net: micrel : ks8851-ml: add vdd-supply support [ Upstream commit ebf4ad955d3e26d4d2a33709624fc7b5b9d3b969 ] Few platforms use external regulator to keep the ethernet MAC supplied. So, request and enable the regulator for driver functionality. Fixes: 66fda75f47dc (regulator: core: Replace direct ops->disable usage) Reported-by: Russell King Suggested-by: Markus Pargmann Signed-off-by: Nishanth Menon Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 80171529b800b4f2e5be2ba4307cd6e94b18aef1 Author: Nicolas Dichtel Date: Wed Mar 19 17:47:51 2014 +0100 ip6mr: fix mfc notification flags [ Upstream commit f518338b16038beeb73e155e60d0f70beb9379f4 ] Commit 812e44dd1829 ("ip6mr: advertise new mfc entries via rtnl") reuses the function ip6mr_fill_mroute() to notify mfc events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf Signed-off-by: Nicolas Dichtel Acked-by: Thomas Graf Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8605c10e5859a56079d52aa15c6d1b4121d4eab7 Author: Nicolas Dichtel Date: Wed Mar 19 17:47:50 2014 +0100 ipmr: fix mfc notification flags [ Upstream commit 65886f439ab0fdc2dff20d1fa87afb98c6717472 ] Commit 8cd3ac9f9b7b ("ipmr: advertise new mfc entries via rtnl") reuses the function ipmr_fill_mroute() to notify mfc events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf Signed-off-by: Nicolas Dichtel Acked-by: Thomas Graf Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 69ccd9b2df19189ea45ce07675c4e00b0df732c2 Author: Nicolas Dichtel Date: Wed Mar 19 17:47:49 2014 +0100 rtnetlink: fix fdb notification flags [ Upstream commit 1c104a6bebf3c16b6248408b84f91d09ac8a26b6 ] Commit 3ff661c38c84 ("net: rtnetlink notify events for FDB NTF_SELF adds and deletes") reuses the function nlmsg_populate_fdb_fill() to notify fdb events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf Signed-off-by: Nicolas Dichtel Acked-by: Thomas Graf Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 652c3e3e38667d11e6e7ca204497930c24d059c3 Author: Eric Dumazet Date: Wed Mar 19 21:02:21 2014 -0700 tcp: syncookies: do not use getnstimeofday() [ Upstream commit 632623153196bf183a69686ed9c07eee98ff1bf8 ] While it is true that getnstimeofday() uses about 40 cycles if TSC is available, it can use 1600 cycles if hpet is the clocksource. Switch to get_jiffies_64(), as this is more than enough, and go back to 60 seconds periods. Fixes: 8c27bd75f04f ("tcp: syncookies: reduce cookie lifetime to 128 seconds") Signed-off-by: Eric Dumazet Cc: Florian Westphal Acked-by: Florian Westphal Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3927dace523706cc00f808520eaf2125dd7c07b5 Author: David Stevens Date: Mon Mar 24 10:39:58 2014 -0400 vxlan: fix nonfunctional neigh_reduce() [ Upstream commit 4b29dba9c085a4fb79058fb1c45a2f6257ca3dfa ] The VXLAN neigh_reduce() code is completely non-functional since check-in. Specific errors: 1) The original code drops all packets with a multicast destination address, even though neighbor solicitations are sent to the solicited-node address, a multicast address. The code after this check was never run. 2) The neighbor table lookup used the IPv6 header destination, which is the solicited node address, rather than the target address from the neighbor solicitation. So neighbor lookups would always fail if it got this far. Also for L3MISSes. 3) The code calls ndisc_send_na(), which does a send on the tunnel device. The context for neigh_reduce() is the transmit path, vxlan_xmit(), where the host or a bridge-attached neighbor is trying to transmit a neighbor solicitation. To respond to it, the tunnel endpoint needs to do a *receive* of the appropriate neighbor advertisement. Doing a send, would only try to send the advertisement, encapsulated, to the remote destinations in the fdb -- hosts that definitely did not do the corresponding solicitation. 4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the isrouter flag in the advertisement. This has nothing to do with whether or not the target is a router, and generally won't be set since the tunnel endpoint is bridging, not routing, traffic. The patch below creates a proxy neighbor advertisement to respond to neighbor solicitions as intended, providing proper IPv6 support for neighbor reduction. Signed-off-by: David L Stevens Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e1ef50212d122b0e775fb234b4c1523d52859acb Author: David Stevens Date: Tue Mar 18 12:32:29 2014 -0400 vxlan: fix potential NULL dereference in arp_reduce() [ Upstream commit 7346135dcd3f9b57f30a5512094848c678d7143e ] This patch fixes a NULL pointer dereference in the event of an skb allocation failure in arp_reduce(). Signed-Off-By: David L Stevens Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7e2c8e661df7e27d6045822f29cf07e346053d6d Author: Bjørn Mork Date: Mon Mar 17 16:25:18 2014 +0100 net: cdc_ncm: fix control message ordering [ Upstream commit ff0992e9036e9810e7cd45234fa32ca1e79750e2 ] This is a context modified revert of commit 6a9612e2cb22 ("net: cdc_ncm: remove ncm_parm field") which introduced a NCM specification violation, causing setup errors for some devices. These errors resulted in the device and host disagreeing about shared settings, with complete failure to communicate as the end result. The NCM specification require that many of the NCM specific control reuests are sent only while the NCM Data Interface is in alternate setting 0. Reverting the commit ensures that we follow this requirement. Fixes: 6a9612e2cb22 ("net: cdc_ncm: remove ncm_parm field") Reported-and-tested-by: Pasi Kärkkäinen Reported-by: Thomas Schäfer Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3e6a4ac48b3ae99739143b03076b341c0137a982 Author: lucien Date: Mon Mar 17 12:51:01 2014 +0800 ipv6: ip6_append_data_mtu do not handle the mtu of the second fragment properly [ Upstream commit e367c2d03dba4c9bcafad24688fadb79dd95b218 ] In ip6_append_data_mtu(), when the xfrm mode is not tunnel(such as transport),the ipsec header need to be added in the first fragment, so the mtu will decrease to reserve space for it, then the second fragment come, the mtu should be turn back, as the commit 0c1833797a5a6ec23ea9261d979aa18078720b74 said. however, in the commit a493e60ac4bbe2e977e7129d6d8cbb0dd236be, it use *mtu = min(*mtu, ...) to change the mtu, which lead to the new mtu is alway equal with the first fragment's. and cannot turn back. when I test through ping6 -c1 -s5000 $ip (mtu=1280): ...frag (0|1232) ESP(spi=0x00002000,seq=0xb), length 1232 ...frag (1232|1216) ...frag (2448|1216) ...frag (3664|1216) ...frag (4880|164) which should be: ...frag (0|1232) ESP(spi=0x00001000,seq=0x1), length 1232 ...frag (1232|1232) ...frag (2464|1232) ...frag (3696|1232) ...frag (4928|116) so delete the min() when change back the mtu. Signed-off-by: Xin Long Fixes: 75a493e60ac4bb ("ipv6: ip6_append_data_mtu did not care about pmtudisc and frag_size") Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2e1540c3867a3b921c387c35101f49220c4efbd5 Author: Heiner Kallweit Date: Wed Mar 12 22:13:19 2014 +0100 ipv6: Avoid unnecessary temporary addresses being generated [ Upstream commit ecab67015ef6e3f3635551dcc9971cf363cc1cd5 ] tmp_prefered_lft is an offset to ifp->tstamp, not now. Therefore age needs to be added to the condition. Age calculation in ipv6_create_tempaddr is different from the one in addrconf_verify and doesn't consider ADDRCONF_TIMER_FUZZ_MINUS. This can cause age in ipv6_create_tempaddr to be less than the one in addrconf_verify and therefore unnecessary temporary address to be generated. Use age calculation as in addrconf_modify to avoid this. Signed-off-by: Heiner Kallweit Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 20ffb64d6f8f4dc93952f90142107f31d4298161 Author: Stefan Wahren Date: Wed Mar 12 11:28:19 2014 +0100 eth: fec: Fix lost promiscuous mode after reconnecting cable [ Upstream commit 84fe61821e4ebab6322eeae3f3c27f77f0031978 ] If the Freescale fec is in promiscuous mode and network cable is reconnected then the promiscuous mode get lost. The problem is caused by a too soon call of set_multicast_list to re-enable promisc mode. The FEC_R_CNTRL register changes are overwritten by fec_restart. This patch fixes this by moving the call behind the init of FEC_R_CNTRL register in fec_restart. Successful tested on a i.MX28 board. Signed-off-by: Stefan Wahren Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c0384f5d2d6582891e3229dd857378a7ffd77d0f Author: dingtianhong Date: Wed Mar 12 17:31:59 2014 +0800 bonding: set correct vlan id for alb xmit path [ Upstream commit fb00bc2e6cd2046282ba4b03f4fe682aee70b2f8 ] The commit d3ab3ffd1d728d7ee77340e7e7e2c7cfe6a4013e (bonding: use rlb_client_info->vlan_id instead of ->tag) remove the rlb_client_info->tag, but occur some issues, The vlan_get_tag() will return 0 for success and -EINVAL for error, so the client_info->vlan_id always be set to 0 if the vlan_get_tag return 0 for success, so the client_info would never get a correct vlan id. We should only set the vlan id to 0 when the vlan_get_tag return error. Fixes: d3ab3ffd1d7 (bonding: use rlb_client_info->vlan_id instead of ->tag) CC: Ding Tianhong CC: Jay Vosburgh CC: Andy Gospodarek Signed-off-by: Ding Tianhong Acked-by: Veaceslav Falico Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit aaf5da50027d01a039ae3d3102ea8e62e4e33d7c Author: Matthew Leach Date: Tue Mar 11 11:58:27 2014 +0000 net: socket: error on a negative msg_namelen [ Upstream commit dbb490b96584d4e958533fb637f08b557f505657 ] When copying in a struct msghdr from the user, if the user has set the msg_namelen parameter to a negative value it gets clamped to a valid size due to a comparison between signed and unsigned values. Ensure the syscall errors when the user passes in a negative value. Signed-off-by: Matthew Leach Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a00acd077a49721fc399f89f4950c3dc597ef3d6 Author: Linus Lüssing Date: Mon Mar 10 22:25:25 2014 +0100 bridge: multicast: enable snooping on general queries only [ Upstream commit 20a599bec95a52fa72432b2376a2ce47c5bb68fb ] Without this check someone could easily create a denial of service by injecting multicast-specific queries to enable the bridge snooping part if no real querier issuing periodic general queries is present on the link which would result in the bridge wrongly shutting down ports for multicast traffic as the bridge did not learn about these listeners. With this patch the snooping code is enabled upon receiving valid, general queries only. Signed-off-by: Linus Lüssing Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3bc6750d9d2f02fcf4b92408fdd624ad48c13491 Author: Linus Lüssing Date: Mon Mar 10 22:25:24 2014 +0100 bridge: multicast: add sanity check for general query destination [ Upstream commit 9ed973cc40c588abeaa58aea0683ea665132d11d ] General IGMP and MLD queries are supposed to have the multicast link-local all-nodes address as their destination according to RFC2236 section 9, RFC3376 section 4.1.12/9.1, RFC2710 section 8 and RFC3810 section 5.1.15. Without this check, such malformed IGMP/MLD queries can result in a denial of service: The queries are ignored by most IGMP/MLD listeners therefore they will not respond with an IGMP/MLD report. However, without this patch these malformed MLD queries would enable the snooping part in the bridge code, potentially shutting down the according ports towards these hosts for multicast traffic as the bridge did not learn about these listeners. Reported-by: Jan Stancek Signed-off-by: Linus Lüssing Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 426f46a7985905445cfedbb00e95d9417d530649 Author: Eric Dumazet Date: Mon Mar 10 09:50:11 2014 -0700 tcp: tcp_release_cb() should release socket ownership [ Upstream commit c3f9b01849ef3bc69024990092b9f42e20df7797 ] Lars Persson reported following deadlock : -000 |M:0x0:0x802B6AF8(asm) <-- arch_spin_lock -001 |tcp_v4_rcv(skb = 0x8BD527A0) <-- sk = 0x8BE6B2A0 -002 |ip_local_deliver_finish(skb = 0x8BD527A0) -003 |__netif_receive_skb_core(skb = 0x8BD527A0, ?) -004 |netif_receive_skb(skb = 0x8BD527A0) -005 |elk_poll(napi = 0x8C770500, budget = 64) -006 |net_rx_action(?) -007 |__do_softirq() -008 |do_softirq() -009 |local_bh_enable() -010 |tcp_rcv_established(sk = 0x8BE6B2A0, skb = 0x87D3A9E0, th = 0x814EBE14, ?) -011 |tcp_v4_do_rcv(sk = 0x8BE6B2A0, skb = 0x87D3A9E0) -012 |tcp_delack_timer_handler(sk = 0x8BE6B2A0) -013 |tcp_release_cb(sk = 0x8BE6B2A0) -014 |release_sock(sk = 0x8BE6B2A0) -015 |tcp_sendmsg(?, sk = 0x8BE6B2A0, ?, ?) -016 |sock_sendmsg(sock = 0x8518C4C0, msg = 0x87D8DAA8, size = 4096) -017 |kernel_sendmsg(?, ?, ?, ?, size = 4096) -018 |smb_send_kvec() -019 |smb_send_rqst(server = 0x87C4D400, rqst = 0x87D8DBA0) -020 |cifs_call_async() -021 |cifs_async_writev(wdata = 0x87FD6580) -022 |cifs_writepages(mapping = 0x852096E4, wbc = 0x87D8DC88) -023 |__writeback_single_inode(inode = 0x852095D0, wbc = 0x87D8DC88) -024 |writeback_sb_inodes(sb = 0x87D6D800, wb = 0x87E4A9C0, work = 0x87D8DD88) -025 |__writeback_inodes_wb(wb = 0x87E4A9C0, work = 0x87D8DD88) -026 |wb_writeback(wb = 0x87E4A9C0, work = 0x87D8DD88) -027 |wb_do_writeback(wb = 0x87E4A9C0, force_wait = 0) -028 |bdi_writeback_workfn(work = 0x87E4A9CC) -029 |process_one_work(worker = 0x8B045880, work = 0x87E4A9CC) -030 |worker_thread(__worker = 0x8B045880) -031 |kthread(_create = 0x87CADD90) -032 |ret_from_kernel_thread(asm) Bug occurs because __tcp_checksum_complete_user() enables BH, assuming it is running from softirq context. Lars trace involved a NIC without RX checksum support but other points are problematic as well, like the prequeue stuff. Problem is triggered by a timer, that found socket being owned by user. tcp_release_cb() should call tcp_write_timer_handler() or tcp_delack_timer_handler() in the appropriate context : BH disabled and socket lock held, but 'owned' field cleared, as if they were running from timer handlers. Fixes: 6f458dfb4092 ("tcp: improve latencies of timer triggered events") Reported-by: Lars Persson Tested-by: Lars Persson Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit bb5393acc2ccb3cb0f94cf5f4374ef250b3bca40 Author: Michael S. Tsirkin Date: Mon Mar 10 19:28:08 2014 +0200 skbuff: skb_segment: orphan frags before copying [ Upstream commit 1fd819ecb90cc9b822cd84d3056ddba315d3340f ] skb_segment copies frags around, so we need to copy them carefully to avoid accessing user memory after reporting completion to userspace through a callback. skb_segment doesn't normally happen on datapath: TSO needs to be disabled - so disabling zero copy in this case does not look like a big deal. Signed-off-by: Michael S. Tsirkin Acked-by: Herbert Xu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0490e20d47bb1b1d26ddcce1d180338560784bf0 Author: Michael S. Tsirkin Date: Mon Mar 10 19:27:59 2014 +0200 skbuff: skb_segment: s/fskb/list_skb/ [ Upstream commit 1a4cedaf65491e66e1e55b8428c89209da729209 ] fskb is unrelated to frag: it's coming from frag_list. Rename it list_skb to avoid confusion. Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e375025a39b44459f5200c84a6ba8ac7c45ac326 Author: Michael S. Tsirkin Date: Mon Mar 10 18:29:19 2014 +0200 skbuff: skb_segment: s/skb/head_skb/ [ Upstream commit df5771ffefb13f8af5392bd54fd7e2b596a3a357 ] rename local variable to make it easier to tell at a glance that we are dealing with a head skb. Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b6d713db7651101cb2ce886e08a7c48658d85896 Author: Michael S. Tsirkin Date: Mon Mar 10 18:29:14 2014 +0200 skbuff: skb_segment: s/skb_frag/frag/ [ Upstream commit 4e1beba12d094c6c761ba5c49032b9b9e46380e8 ] skb_frag can in fact point at either skb or fskb so rename it generally "frag". Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fe50c072d1326497ea9b1e7873d52382ff04c37a Author: Michael S. Tsirkin Date: Mon Mar 10 18:29:04 2014 +0200 skbuff: skb_segment: s/frag/nskb_frag/ [ Upstream commit 8cb19905e9287a93ce7c2cbbdf742a060b00e219 ] frag points at nskb, so name it appropriately Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 92a887902f1ce91160183372e10c5ea7635efcd4 Author: Peter Boström Date: Mon Mar 10 16:17:15 2014 +0100 vlan: Set correct source MAC address with TX VLAN offload enabled [ Upstream commit dd38743b4cc2f86be250eaf156cf113ba3dd531a ] With TX VLAN offload enabled the source MAC address for frames sent using the VLAN interface is currently set to the address of the real interface. This is wrong since the VLAN interface may be configured with a different address. The bug was introduced in commit 2205369a314e12fcec4781cc73ac9c08fc2b47de ("vlan: Fix header ops passthru when doing TX VLAN offload."). This patch sets the source address before calling the create function of the real interface. Signed-off-by: Peter Boström Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 66295293f1c89b83290dc610db9f8fc555f5a92b Author: Annie Li Date: Mon Mar 10 22:58:34 2014 +0800 Xen-netback: Fix issue caused by using gso_type wrongly [ Upstream commit 5bd076708664313f2bdbbc1cf71093313b7774a1 ] Current netback uses gso_type to check whether the skb contains gso offload, and this is wrong. Gso_size is the right one to check gso existence, and gso_type is only used to check gso type. Some skbs contains nonzero gso_type and zero gso_size, current netback would treat these skbs as gso and create wrong response for this. This also causes ssh failure to domu from other server. V2: use skb_is_gso function as Paul Durrant suggested Signed-off-by: Annie Li Acked-by: Wei Liu Reviewed-by: Paul Durrant Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit eafe827b3bff6faf6c108a7b24e4bb6a025bd514 Author: Eric Dumazet Date: Thu Mar 6 22:57:52 2014 -0800 pkt_sched: fq: do not hold qdisc lock while allocating memory [ Upstream commit 2d8d40afd187bced0a3d056366fb58d66fe845e3 ] Resizing fq hash table allocates memory while holding qdisc spinlock, with BH disabled. This is definitely not good, as allocation might sleep. We can drop the lock and get it when needed, we hold RTNL so no other changes can happen at the same time. Signed-off-by: Eric Dumazet Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler") Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7cf38b2b8931f760e5eb4433f257a0b6202c9c5e Author: Michael Chan Date: Sun Mar 9 15:45:32 2014 -0800 bnx2: Fix shutdown sequence [ Upstream commit a8d9bc2e9f5d1c5a25e33cec096d2a1652d3fd52 ] The pci shutdown handler added in: bnx2: Add pci shutdown handler commit 25bfb1dd4ba3b2d9a49ce9d9b0cd7be1840e15ed created a shutdown down sequence without chip reset if the device was never brought up. This can cause the firmware to shutdown the PHY prematurely and cause MMIO read cycles to be unresponsive. On some systems, it may generate NMI in the bnx2's pci shutdown handler. The fix is to tell the firmware not to shutdown the PHY if there was no prior chip reset. Signed-off-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0e0e70fba39b7151196116409daf4d35e93d173c Author: Sabrina Dubroca Date: Thu Mar 6 17:51:57 2014 +0100 ipv6: don't set DST_NOCOUNT for remotely added routes [ Upstream commit c88507fbad8055297c1d1e21e599f46960cbee39 ] DST_NOCOUNT should only be used if an authorized user adds routes locally. In case of routes which are added on behalf of router advertisments this flag must not get used as it allows an unlimited number of routes getting added remotely. Signed-off-by: Sabrina Dubroca Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0ceb6013f1a3b17eb187f598b3bf935e6c17d168 Author: Anton Nayshtut Date: Wed Mar 5 08:30:08 2014 +0200 ipv6: Fix exthdrs offload registration. [ Upstream commit d2d273ffabd315eecefce21a4391d44b6e156b73 ] Without this fix, ipv6_exthdrs_offload_init doesn't register IPPROTO_DSTOPTS offload, but returns 0 (as the IPPROTO_ROUTING registration actually succeeds). This then causes the ipv6_gso_segment to drop IPv6 packets with IPPROTO_DSTOPTS header. The issue detected and the fix verified by running MS HCK Offload LSO test on top of QEMU Windows guests, as this test sends IPv6 packets with IPPROTO_DSTOPTS. Signed-off-by: Anton Nayshtut Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a0110722e14d34d8ffbc273f209db1c86fb26de8 Author: Eric Dumazet Date: Tue Mar 25 18:42:27 2014 -0700 net: unix: non blocking recvmsg() should not return -EINTR [ Upstream commit de1443916791d75fdd26becb116898277bb0273f ] Some applications didn't expect recvmsg() on a non blocking socket could return -EINTR. This possibility was added as a side effect of commit b3ca9b02b00704 ("net: fix multithreaded signal handling in unix recv routines"). To hit this bug, you need to be a bit unlucky, as the u->readlock mutex is usually held for very small periods. Fixes: b3ca9b02b00704 ("net: fix multithreaded signal handling in unix recv routines") Signed-off-by: Eric Dumazet Cc: Rainer Weikusat Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 30183acef6ff2c3444de7a542f6f2748205ab21d Author: Florian Westphal Date: Thu Mar 6 18:06:41 2014 +0100 inet: frag: make sure forced eviction removes all frags [ Upstream commit e588e2f286ed7da011ed357c24c5b9a554e26595 ] Quoting Alexander Aring: While fragmentation and unloading of 6lowpan module I got this kernel Oops after few seconds: BUG: unable to handle kernel paging request at f88bbc30 [..] Modules linked in: ipv6 [last unloaded: 6lowpan] Call Trace: [] ? call_timer_fn+0x54/0xb3 [] ? process_timeout+0xa/0xa [] run_timer_softirq+0x140/0x15f Problem is that incomplete frags are still around after unload; when their frag expire timer fires, we get crash. When a netns is removed (also done when unloading module), inet_frag calls the evictor with 'force' argument to purge remaining frags. The evictor loop terminates when accounted memory ('work') drops to 0 or the lru-list becomes empty. However, the mem accounting is done via percpu counters and may not be accurate, i.e. loop may terminate prematurely. Alter evictor to only stop once the lru list is empty when force is requested. Reported-by: Phoebe Buckheister Reported-by: Alexander Aring Tested-by: Alexander Aring Signed-off-by: Florian Westphal Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fc9696c20861a1b3cb8e493344c35237839671e2 Author: Erik Hugne Date: Thu Mar 6 14:40:21 2014 +0100 tipc: don't log disabled tasklet handler errors [ Upstream commit 2892505ea170094f982516bb38105eac45f274b1 ] Failure to schedule a TIPC tasklet with tipc_k_signal because the tasklet handler is disabled is not an error. It means TIPC is currently in the process of shutting down. We remove the error logging in this case. Signed-off-by: Erik Hugne Reviewed-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 59caf2777566e6cb4ef45d092e87d4780db2943f Author: Erik Hugne Date: Thu Mar 6 14:40:20 2014 +0100 tipc: fix memory leak during module removal [ Upstream commit 1bb8dce57f4d15233688c68990852a10eb1cd79f ] When the TIPC module is removed, the tasklet handler is disabled before all other subsystems. This will cause lingering publications in the name table because the node_down tasklets responsible to clean up publications from an unreachable node will never run. When the name table is shut down, these publications are detected and an error message is logged: tipc: nametbl_stop(): orphaned hash chain detected This is actually a memory leak, introduced with commit 993b858e37b3120ee76d9957a901cca22312ffaa ("tipc: correct the order of stopping services at rmmod") Instead of just logging an error and leaking memory, we free the orphaned entries during nametable shutdown. Signed-off-by: Erik Hugne Reviewed-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1b954f5307dec6871b1075537c22969e5786d1fc Author: Erik Hugne Date: Thu Mar 6 14:40:19 2014 +0100 tipc: drop subscriber connection id invalidation [ Upstream commit edcc0511b5ee7235282a688cd604e3ae7f9e1fc9 ] When a topology server subscriber is disconnected, the associated connection id is set to zero. A check vs zero is then done in the subscription timeout function to see if the subscriber have been shut down. This is unnecessary, because all subscription timers will be cancelled when a subscriber terminates. Setting the connection id to zero is actually harmful because id zero is the identity of the topology server listening socket, and can cause a race that leads to this socket being closed instead. Signed-off-by: Erik Hugne Acked-by: Ying Xue Reviewed-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d7b4510a84a79153054e0ae707be2d6ed8322440 Author: Ying Xue Date: Thu Mar 6 14:40:17 2014 +0100 tipc: fix connection refcount leak [ Upstream commit 4652edb70e8a7eebbe47fa931940f65522c36e8f ] When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a connection instance, its reference count value is increased if it's found. But subsequently if it's found that the connection is closed, the work of sending message is not queued into its server send workqueue, and the connection reference count is not decreased. This will cause a reference count leak. To reproduce this problem, an application would need to open and closes topology server connections with high intensity. We fix this by immediately decrementing the connection reference count if a send fails due to the connection being closed. Signed-off-by: Ying Xue Acked-by: Erik Hugne Reviewed-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3e28aa7519e2f20d048ca5743d1219a6474eb50e Author: Ying Xue Date: Thu Mar 6 14:40:16 2014 +0100 tipc: allow connection shutdown callback to be invoked in advance [ Upstream commit 6d4ebeb4df0176b1973875840a9f7e91394c0685 ] Currently connection shutdown callback function is called when connection instance is released in tipc_conn_kref_release(), and receiving packets and sending packets are running in different threads. Even if connection is closed by the thread of receiving packets, its shutdown callback may not be called immediately as the connection reference count is non-zero at that moment. So, although the connection is shut down by the thread of receiving packets, the thread of sending packets doesn't know it. Before its shutdown callback is invoked to tell the sending thread its connection has been closed, the sending thread may deliver messages by tipc_conn_sendmsg(), this is why the following error information appears: "Sending subscription event failed, no memory" To eliminate it, allow connection shutdown callback function to be called before connection id is removed in tipc_close_conn(), which makes the sending thread know the truth in time that its socket is closed so that it doesn't send message to it. We also remove the "Sending XXX failed..." error reporting for topology and config services. Signed-off-by: Ying Xue Signed-off-by: Erik Hugne Reviewed-by: Jon Maloy Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit dbfc575804afb124cdc27dade949d602c4925a78 Author: Linus Lüssing Date: Tue Mar 4 03:57:35 2014 +0100 bridge: multicast: add sanity check for query source addresses [ Upstream commit 6565b9eeef194afbb3beec80d6dd2447f4091f8c ] MLD queries are supposed to have an IPv6 link-local source address according to RFC2710, section 4 and RFC3810, section 5.1.14. This patch adds a sanity check to ignore such broken MLD queries. Without this check, such malformed MLD queries can result in a denial of service: The queries are ignored by any MLD listener therefore they will not respond with an MLD report. However, without this patch these malformed MLD queries would enable the snooping part in the bridge code, potentially shutting down the according ports towards these hosts for multicast traffic as the bridge did not learn about these listeners. Reported-by: Jan Stancek Signed-off-by: Linus Lüssing Reviewed-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 37612b6148d8932256188aeb98a412a82e27d078 Author: Daniel Borkmann Date: Tue Mar 4 16:35:51 2014 +0100 net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk [ Upstream commit c485658bae87faccd7aed540fd2ca3ab37992310 ] While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable"), we noticed that there's a skb memory leakage in the error path. Running the same reproducer as in ec0223ec48a9 and by unconditionally jumping to the error label (to simulate an error condition) in sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about the unfreed chunk->auth_chunk skb clone: Unreferenced object 0xffff8800b8f3a000 (size 256): comm "softirq", pid 0, jiffies 4294769856 (age 110.757s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00 ..u^..X......... backtrace: [] kmemleak_alloc+0x4e/0xb0 [] kmem_cache_alloc+0xc8/0x210 [] skb_clone+0x49/0xb0 [] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp] [] sctp_inq_push+0x4c/0x70 [sctp] [] sctp_rcv+0x82e/0x9a0 [sctp] [] ip_local_deliver_finish+0xa8/0x210 [] nf_reinject+0xbf/0x180 [] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue] [] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink] [] netlink_rcv_skb+0xa9/0xc0 [] nfnetlink_rcv+0x23f/0x408 [nfnetlink] [] netlink_unicast+0x168/0x250 [] netlink_sendmsg+0x2e1/0x3f0 [] sock_sendmsg+0x8b/0xc0 [] ___sys_sendmsg+0x369/0x380 What happens is that commit bbd0d59809f9 clones the skb containing the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case that an endpoint requires COOKIE-ECHO chunks to be authenticated: ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ----------> <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] --------- ------------------ AUTH; COOKIE-ECHO ----------------> <-------------------- COOKIE-ACK --------------------- When we enter sctp_sf_do_5_1D_ce() and before we actually get to the point where we process (and subsequently free) a non-NULL chunk->auth_chunk, we could hit the "goto nomem_init" path from an error condition and thus leave the cloned skb around w/o freeing it. The fix is to centrally free such clones in sctp_chunk_destroy() handler that is invoked from sctp_chunk_free() after all refs have dropped; and also move both kfree_skb(chunk->auth_chunk) there, so that chunk->auth_chunk is either NULL (since sctp_chunkify() allocs new chunks through kmem_cache_zalloc()) or non-NULL with a valid skb pointer. chunk->skb and chunk->auth_chunk are the only skbs in the sctp_chunk structure that need to be handeled. While at it, we should use consume_skb() for both. It is the same as dev_kfree_skb() but more appropriately named as we are not a device but a protocol. Also, this effectively replaces the kfree_skb() from both invocations into consume_skb(). Functions are the same only that kfree_skb() assumes that the frame was being dropped after a failure (e.g. for tools like drop monitor), usage of consume_skb() seems more appropriate in function sctp_chunk_destroy() though. Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk") Signed-off-by: Daniel Borkmann Cc: Vlad Yasevich Cc: Neil Horman Acked-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f3827de6d3209745e3d9823955e5857ebac62d3d Author: Nikolay Aleksandrov Date: Mon Mar 3 23:19:18 2014 +0100 net: fix for a race condition in the inet frag code [ Upstream commit 24b9bf43e93e0edd89072da51cf1fab95fc69dec ] I stumbled upon this very serious bug while hunting for another one, it's a very subtle race condition between inet_frag_evictor, inet_frag_intern and the IPv4/6 frag_queue and expire functions (basically the users of inet_frag_kill/inet_frag_put). What happens is that after a fragment has been added to the hash chain but before it's been added to the lru_list (inet_frag_lru_add) in inet_frag_intern, it may get deleted (either by an expired timer if the system load is high or the timer sufficiently low, or by the fraq_queue function for different reasons) before it's added to the lru_list, then after it gets added it's a matter of time for the evictor to get to a piece of memory which has been freed leading to a number of different bugs depending on what's left there. I've been able to trigger this on both IPv4 and IPv6 (which is normal as the frag code is the same), but it's been much more difficult to trigger on IPv4 due to the protocol differences about how fragments are treated. The setup I used to reproduce this is: 2 machines with 4 x 10G bonded in a RR bond, so the same flow can be seen on multiple cards at the same time. Then I used multiple instances of ping/ping6 to generate fragmented packets and flood the machines with them while running other processes to load the attacked machine. *It is very important to have the _same flow_ coming in on multiple CPUs concurrently. Usually the attacked machine would die in less than 30 minutes, if configured properly to have many evictor calls and timeouts it could happen in 10 minutes or so. An important point to make is that any caller (frag_queue or timer) of inet_frag_kill will remove both the timer refcount and the original/guarding refcount thus removing everything that's keeping the frag from being freed at the next inet_frag_put. All of this could happen before the frag was ever added to the LRU list, then it gets added and the evictor uses a freed fragment. An example for IPv6 would be if a fragment is being added and is at the stage of being inserted in the hash after the hash lock is released, but before inet_frag_lru_add executes (or is able to obtain the lru lock) another overlapping fragment for the same flow arrives at a different CPU which finds it in the hash, but since it's overlapping it drops it invoking inet_frag_kill and thus removing all guarding refcounts, and afterwards freeing it by invoking inet_frag_put which removes the last refcount added previously by inet_frag_find, then inet_frag_lru_add gets executed by inet_frag_intern and we have a freed fragment in the lru_list. The fix is simple, just move the lru_add under the hash chain locked region so when a removing function is called it'll have to wait for the fragment to be added to the lru_list, and then it'll remove it (it works because the hash chain removal is done before the lru_list one and there's no window between the two list adds when the frag can get dropped). With this fix applied I couldn't kill the same machine in 24 hours with the same setup. Fixes: 3ef0eb0db4bf ("net: frag, move LRU list maintenance outside of rwlock") CC: Florian Westphal CC: Jesper Dangaard Brouer CC: David S. Miller Signed-off-by: Nikolay Aleksandrov Acked-by: Jesper Dangaard Brouer Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e6496f503a0bbe5af1a6e9116cf6f3acbd426a6c Author: Paul Moore Date: Wed Mar 19 16:46:18 2014 -0400 selinux: correctly label /proc inodes in use before the policy is loaded commit f64410ec665479d7b4b77b7519e814253ed0f686 upstream. This patch is based on an earlier patch by Eric Paris, he describes the problem below: "If an inode is accessed before policy load it will get placed on a list of inodes to be initialized after policy load. After policy load we call inode_doinit() which calls inode_doinit_with_dentry() on all inodes accessed before policy load. In the case of inodes in procfs that means we'll end up at the bottom where it does: /* Default to the fs superblock SID. */ isec->sid = sbsec->sid; if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) { if (opt_dentry) { isec->sclass = inode_mode_to_security_class(...) rc = selinux_proc_get_sid(opt_dentry, isec->sclass, &sid); if (rc) goto out_unlock; isec->sid = sid; } } Since opt_dentry is null, we'll never call selinux_proc_get_sid() and will leave the inode labeled with the label on the superblock. I believe a fix would be to mimic the behavior of xattrs. Look for an alias of the inode. If it can't be found, just leave the inode uninitialized (and pick it up later) if it can be found, we should be able to call selinux_proc_get_sid() ..." On a system exhibiting this problem, you will notice a lot of files in /proc with the generic "proc_t" type (at least the ones that were accessed early in the boot), for example: # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }' system_u:object_r:proc_t:s0 /proc/sys/kernel/shmmax However, with this patch in place we see the expected result: # ls -Z /proc/sys/kernel/shmmax | awk '{ print $4 " " $5 }' system_u:object_r:sysctl_kernel_t:s0 /proc/sys/kernel/shmmax Cc: Eric Paris Signed-off-by: Paul Moore Acked-by: Eric Paris Signed-off-by: Greg Kroah-Hartman