summaryrefslogtreecommitdiff
path: root/drivers/infiniband
AgeCommit message (Collapse)AuthorFilesLines
2006-03-24IPoIB: P_Key change event handlingLeonid Arsh4-19/+44
This patch causes the network interface to respond to P_Key change events correctly. As a result, you'll see a child interface in the "RUNNING" state (netif_carrier_on()) only when the corresponding P_Key is configured by the SM. When SM removes a P_Key, the "RUNNING" state will be disabled for the corresponding network interface. To implement this, I added IB_EVENT_PKEY_CHANGE event handling. To prevent flushing the device before the device is open by the "delay open" mechanism, I added an additional device flag called IPOIB_FLAG_INITIALIZED. This also prevents the child network interface from trying to join to multicast groups until the PKEY is configured. We used to get error messages like: ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff in this case. To fix this, I just check IPOIB_FLAG_OPER_UP flag in ipoib_set_mcast_list(). Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Fix modify QP error pathRoland Dreier1-10/+11
If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would still do some things it shouldn't, such as store away attributes for special QPs. Fix this, and simplify the code, by simply jumping to the exit path if mthca_MODIFY_QP() fails. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IPoIB: Fix network interface "RUNNING" statusLeonid Arsh1-2/+3
With the current IPoIB driver, the status of network interfaces stays "RUNNING" even if the link goes down (for example because a cable is unplugged). Fix this by flushing the IPoIB interface when the link goes down. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Fix indentationRoland Dreier1-2/+2
Fix some whitespace damage (indenting with spaces) that snuck in. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Fix uninitialized variable in mthca_alloc_qp()Jack Morgenstein1-4/+5
mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport before calling mthca_set_qp_size(), since the value is used there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Check SRQ limit in modify SRQ operationJack Morgenstein1-0/+2
When setting the shared receive queue (SRQ) watermark in a modify SRQ operation, make sure that the supplied value is not larger than the full size of the SRQ. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Check that SRQ WQE size does not exceed device's max valueJack Morgenstein1-0/+4
Guarantee the calculated work queue entry size does not exceed the max allowable WQE size when creating an SRQ. This is a problem with Arbel in Tavor-compatibility mode because the current WQE size computation method rounds up to next power of 2. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/mthca: Check that sgid_index and path_mtu are valid in modify_qpDotan Barak1-4/+23
Add a check that the modify QP parameters sgid_index and path_mtu are valid, since they might come from userspace. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IB/srp: Use a fake scatterlist for non-SG SCSI commandsRoland Dreier2-77/+75
Since the SCSI midlayer is moving towards entirely getting rid of commands with use_sg == 0, we should treat this case as an exception. Therefore, change the IB SRP initiator to create a fake scatterlist for these commands with sg_init_one(). This simplifies the flow of DMA mapping and unmapping, since SRP can just use dma_map_sg() and dma_unmap_sg() unconditionally, rather than having to choose between the dma_{map,unmap}_sg() and dma_{map,unmap}_single() variants. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24IPoIB: Pass correct pointer when flushing child interfacesLeonid Arsh1-1/+1
ipoib_ib_dev_flush() should get passed cpriv->dev, not &cpriv->dev. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-21Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds1-15/+1
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (235 commits) [NETFILTER]: Add H.323 conntrack/NAT helper [TG3]: Don't mark tg3_test_registers() as returning const. [IPV6]: Cleanups for net/ipv6/addrconf.c (kzalloc, early exit) v2 [IPV6]: Nearly complete kzalloc cleanup for net/ipv6 [IPV6]: Cleanup of net/ipv6/reassambly.c [BRIDGE]: Remove duplicate const from is_link_local() argument type. [DECNET]: net/decnet/dn_route.c: fix inconsequent NULL checking [TG3]: make drivers/net/tg3.c:tg3_request_irq() static [BRIDGE]: use LLC to send STP [LLC]: llc_mac_hdr_init const arguments [BRIDGE]: allow show/store of group multicast address [BRIDGE]: use llc for receiving STP packets [BRIDGE]: stp timer to jiffies cleanup [BRIDGE]: forwarding remove unneeded preempt and bh diasables [BRIDGE]: netfilter inline cleanup [BRIDGE]: netfilter VLAN macro cleanup [BRIDGE]: netfilter dont use __constant_htons [BRIDGE]: netfilter whitespace [BRIDGE]: optimize frame pass up [BRIDGE]: use kzalloc ...
2006-03-20[INFINIBAND] ipoib: Remove leftover use of neigh_ops->destructorArnaldo Carvalho de Melo1-1/+0
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20[NET]: Move destructor from neigh->ops to neigh_paramsMichael S. Tsirkin1-15/+1
struct neigh_ops currently has a destructor field, which no in-kernel drivers outside of infiniband use. The infiniband/ulp/ipoib in-tree driver stashes some info in the neighbour structure (the results of the second-stage lookup from ARP results to real link-level path), and it uses neigh->ops->destructor to get a callback so it can clean up this extra info when a neighbour is freed. We've run into problems with this: since the destructor is in an ops field that is shared between neighbours that may belong to different net devices, there's no way to set/clear it safely. The following patch moves this field to neigh_parms where it can be safely set, together with its twin neigh_setup. Two additional patches in the patch series update ipoib to use this new interface. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20IB/mthca: Query SRQ srq_limit fixesEli Cohen1-4/+8
Fix endianness handling of srq_limit: it is big-endian in the context structure, so we need to swab it before returning it. Also add support for srq_limit query for Tavor (non-MemFree) HCAs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Get rid of useless test of queue lengthRoland Dreier1-6/+1
In neigh_add_path(), the queue of delayed packets can never be full, because the queue is always freshly created and cannot be found by any other code path. In fact, the test of the queue length is worse than useless: if somehow the test ever triggered and path_rec_start() also failed, then dev_kfree_skb_any() will be called twice on the same skb. Fix this by deleting the useless test. Pointed out by Michael S. Tsirkin <mst@mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Correct reported SRQ size in MemFree case.Dotan Barak1-2/+2
MemFree devices need to reserve one shared receive queue (SRQ) work request for internal use, so the capacity returned from the create_srq and query_srq methods should be srq->max - 1. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mad: Fix oopsable race on device removalMichael S. Tsirkin1-7/+14
Fix an oopsable race debugged by Eli Cohen <eli@mellanox.co.il>: After removing the port from port_list, ib_mad_port_close flushes port_priv->wq before destroying the special QPs. This means that a completion event could arrive, and queue a new work in this work queue after flush. This patch also removes an unnecessary flush_workqueue(): destroy_workqueue() already includes a flush. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/srp: Coverity fix to srp_parse_options()Roland Dreier1-0/+1
Fix leak found by Coverity: in the SRP_OPT_DGID case, srp_parse_options() didn't free the result of match_strdup(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Coverity fix to mthca_init_eq_table()Roland Dreier1-1/+1
Fix bug found by coverity: the loop body never executed, because it was doing for (i = 0; i < MTHCA_EQ_CMD; ++i), but MTHCA_EQ_CMD is 0. The correct loop bound is MTHCA_NUM_EQ, to loop over all EQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Coverity fixes to sysfs.cRoland Dreier1-2/+4
Fix two bugs found by coverity: - Memory leak in error path of alloc_group_attrs() - Fencepost error in state_show(): the test should be < ARRAY_SIZE(), not <= ARRAY_SIZE(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Move ipoib_ib_dev_flush() to ipoib workqueueJack Morgenstein4-7/+14
Move ipoib_ib_dev_flush() to ipoib's workqueue. This keeps it ordered with respect to other work scheduled by the ipoib driver. This fixes problems with races, for example: - ipoib_ib_dev_flush() has started running because of an IB event - user does ifconfig ib0 down - ipoib_mcast_stop_thread() gets called twice and waits for the same completion twice Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Fix build now that neighbour destructor is in neigh_paramsRoland Dreier1-1/+0
Fix the IPoIB build (which is broken in net-2.6.17 because of my screw-up, which left out this chunk in ipoib_multicast.c). The neighbour destructor is now in neigh_params, so we don't need to clear it in the ops structure. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Use correct alt_pkey_index in modify QPAmi Perlmutter1-1/+1
The old code incorrectly used the primary P_Key index as the alternate index too. Signed-off-by: Ami Perlmutter <amip@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/umad: Add support for large RMPP transfersJack Morgenstein4-209/+346
Add support for sending and receiving large RMPP transfers. The old code supports transfers only as large as a single contiguous kernel memory allocation. This patch uses linked list of memory buffers when sending and receiving data to avoid needing contiguous pages for larger transfers. Receive side: copy the arriving MADs in chunks instead of coalescing to one large buffer in kernel space. Send side: split a multipacket MAD buffer to a list of segments, (multipacket_list) and send these using a gather list of size 2. Also, save pointer to last sent segment, and retrieve requested segments by walking list starting at last sent segment. Finally, save pointer to last-acked segment. When retrying, retrieve segments for resending relative to this pointer. When updating last ack, start at this pointer. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/srp: Add SCSI host attributes to show target portRoland Dreier1-1/+83
Add SCSI host attributes in sysfs that show the ID extension, IOC GUID, service ID, P_Key and destination GID for each target port that the SRP initiator connects to. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/cm: Check cm_id state before handling a REPSean Hefty1-18/+24
Move checking the state of a cm_id before modifying it when handling a REP. This fixes a bug seen under MPI scale-up testing, where a NULL timewait_info pointer is dereferenced if a request times out before a REP is received. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Update firmware versionsRoland Dreier1-3/+3
Update known firmware versions in driver's table to the latest releases. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Optimize large messages on Sinai HCAsEli Cohen5-17/+51
Sinai (one-port PCI Express) HCAs get improved throughput for messages bigger than 80 KB in DDR mode if memory keys are formatted in a specific way. The enhancement only works if the memory key table is smaller than 2^24 entries. For larger tables, the enhancement is off and a warning is printed (to avoid silent performance loss). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Fix query QP return of sq_sig_allDotan Barak1-1/+1
The old code didn't convert from the kernel's enum correctly. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Fix modify QP checking of "current QP state" attributeDotan Barak1-4/+4
According to the IB spec version 1.2, section 11.2.4.2, the current table has a couple of mistakes where it allows the current QP state (IB_QP_CUR_STATE) attribute. For the transitions: RTS -> RTS: IB_QP_CUR_STATE should be allowed for all transports SQD -> SQD: IB_QP_CUR_STATE should never be allowed Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Fix multicast race between canceling and completingMichael S. Tsirkin1-3/+12
ipoib_mcast_stop_thread currently tests mcast->query and if it is NULL, does not perform wait_for_completion on the mcast and frees the mcast object directly. However, since both operations are done without locking, it is possible that ipoib_mcast_join_complete is in progress on this mcast object and has set mcast->query to NULL already. Solve this by: - taking priv->lock before we change mcast->query in ipoib_mcast_join_complete, and keeping it until we no longer need the mcast object - taking priv->lock around mcast->query test in ipoib_mcast_stop_thread Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Clean up if posting receives failsEli Cohen1-0/+1
If posting receives in ipoib_ib_dev_open() fails, call ipoib_ib_dev_stop() to move the device's QP back to the RESET state so that we can try again later. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Use an enum for HCA page sizeIshai Rabinovitz4-30/+37
Use a named enum for the HCA's internal page size, rather than having magic values of 4096 and shifts by 12 all over the code. Also, fix one minor bug in EQ handling: only one HCA page is mapped to the HCA during initialization, but a full kernel page is unmapped during cleanup. This might cause problems when PAGE_SIZE != 4096. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Check alternate P_Key index when setting alternate pathDotan Barak1-2/+8
Check that the alternate P_Key index is in range when setting the alternate path for a QP. Also make a cosmetic touch up to the debug message printed when the main P_Key index is out of range. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Add support for send work request fence flagDotan Barak1-2/+6
Add support for IB_SEND_FENCE flag in post_send methods. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: Close race in setting mcast->ahEli Cohen1-2/+7
ipoib_mcast_send() tests mcast->ah twice. If this value is changed between these two points, we leak an skb. However, ipoib_mcast_join_finish() sets mcast->ah with no locking, so it could race against ipoib_mcast_send(). As a solution, take priv->lock around assignment to mcast->ah thus making sure ipoib_mcast_send() (which also takes priv->lock) is not in flight. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Implement query_ah methodJack Morgenstein3-0/+33
Implement query_ah (except for AVs which are in HCA memory). This is needed to implement RMPP duplicate session detection on sending side (extraction of DGID/DLID and GRH flag from address handle). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Write FW commands through doorbell pageEli Cohen2-23/+136
This patch is checks whether the HCA supports posting FW commands through a doorbell page (user access region 0, or "UAR0"). If this is supported, the driver maps UAR0 and uses it for FW commands. This can be controlled by the value of a writable module parameter fw_cmd_doorbell. When the parameter is 0, the commands are posted through HCR using the old method; otherwise if HCA is capable commands go through UAR0. This use of UAR0 to post commands eliminates the need for polling the "go" bit prior to posting a new command. Since reading from a PCI device is much more expensive then issuing a posted write, it is expected that issuing FW commands this way will provide better CPU utilization. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Return actual capacity from create SRQ operationDotan Barak1-0/+2
Pass actual capacity of created SRQ back to userspace, so that userspace can report accurate capacities. This requires an ABI bump, to change struct ib_uverbs_create_srq_resp. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Return actual capacity from create_srqDotan Barak1-0/+3
Have mthca's create_srq method return the actual capacity of the SRQ that gets created. Also update comments in <rdma/ib_verbs.h> to clarify that this is what is expected from ib_create_srq(). Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IPoIB: clarify to_ipoib_neigh()Michael S. Tsirkin1-2/+8
Cosmetic change: make alignment explicit in to_ipoib_neigh. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Bump driver version and release dateRoland Dreier1-2/+2
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Support for query QP and SRQEli Cohen6-1/+184
Implement the query_qp and query_srq methods in mthca. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query SRQ from userspaceDotan Barak3-0/+46
Add support to uverbs to handle querying userspace SRQs (shared receive queues), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_srq() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/uverbs: Support for query QP from userspaceDotan Barak3-0/+102
Add support to uverbs to handle querying userspace QPs (queue pairs), including adding an ABI for marshalling requests and responses. The kernel midlayer already has the underlying ib_query_qp() function. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Whitespace cleanupsRoland Dreier3-9/+7
Remove trailing whitespace and fix indentation that with spaces instead of tabs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Convert to use ib_modify_qp_is_ok()Roland Dreier3-290/+76
Use ib_modify_qp_is_ok() in mthca, and delete the big table of attributes for queue pair state transitions. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB: Add ib_modify_qp_is_ok() library functionRoland Dreier1-0/+252
The in-kernel mthca driver contains a table of which attributes are valid for each queue pair state transition. It turns out that both other IB drivers -- ipath and ehca -- which are being prepared for merging have copied this table, errors and all. To forestall this code duplication, move this table and the code to check parameters against it into a midlayer library function, ib_modify_qp_is_ok(). Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mthca: Generate SQ drained events when requestedRoland Dreier2-2/+15
Add low-level driver support to ib_mthca so that consumers can request a "send queue drained" event be generated when a transiton to the SQD state completes. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20IB/mad: Simplify SMI by eliminating smi_check_local_dr_smp()Ralph Campbell3-27/+6
The call to ib_get_agent_port() shouldn't be possible to fail when smi_check_local_dr_smp() is called from ib_mad_recv_done_handler(). When it is called from handle_outgoing_dr_smp(), the device and port_num come from mad_agent_priv so I assume the call to ib_get_agent_port() shouldn't fail either. In either case, smi_check_local_smp() only uses the mad_agent pointer to check that mad_agent->device->process_mad is not NULL. The device pointer would have to be the same as the one passed to smi_check_local_dr_smp() since that pointer is used later instead of the one checked in smi_check_local_smp(). Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>