path: root/drivers/misc/habanalabs
Commit message | Author | Age | Files | Lines
* habanalabs: increase h/w timer when checking idle | Omer Shpigelman | 2020-06-24 | 1 | -0/+2
  In GAUDI, the current timer value that the hardware uses to check whether it is in IDLE state is too low. As a result, there are occasions where the H/W wrongly reports that it is not IDLE. The driver checks the IDLE state before submitting its own work during initialization, so a false report might cause device initialization to fail.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: Correct handling when failing to enqueue CB | Ofir Bitton | 2020-06-24 | 1 | -0/+13
  The fence release flow is different if the CS was never submitted. In that case, there is no hw_sob object attached that we need to "put". If the CS was aborted, however, we do need to "put" the hw_sob.

  Signed-off-by: Ofir Bitton <obitton@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: increase GAUDI QMAN ARB WDT timeout | Oded Gabbay | 2020-06-24 | 1 | -1/+1
  The current timeout is too low for some of the workloads and we see false errors as a result.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: rename mmu_write() to mmu_asid_va_write() | Oded Gabbay | 2020-06-24 | 1 | -2/+2
  The function name conflicts with a static inline function in arch/m68k/include/asm/mcfmmu.h

  Reported-by: kernel test robot <lkp@intel.com>
  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: use PI in MMU cache invalidation | Omer Shpigelman | 2020-06-24 | 2 | -0/+11
  The PS flow for MMU cache invalidation caused timeouts in stress tests. Use the PS + PI flow instead, so that no timeouts should happen.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: block scalar load_and_exe on external queue | Oded Gabbay | 2020-06-24 | 2 | -1/+27
  In Gaudi, the user can't execute scalar load_and_exe on an external queue because it can be a security hole. The driver doesn't parse the commands being loaded, and one of them could be msg_prot, which the user isn't allowed to use.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: correctly cast u64 to void* | Oded Gabbay | 2020-06-01 | 1 | -1/+1
  Use the u64_to_user_ptr(x) kernel macro to correctly cast u64 to void*

  Reported-by: kbuild test robot <lkp@intel.com>
  Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
  Link: https://lore.kernel.org/r/20200601065648.8775-2-oded.gabbay@gmail.com
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
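The cast fixed by this commit can be illustrated outside the kernel. The sketch below is a user-space analogue, not the kernel's definition of u64_to_user_ptr(); the helper names u64_to_ptr and ptr_to_u64 are hypothetical. It goes through uintptr_t so that the conversion is explicit about pointer width, which is what avoids the "cast to pointer from integer of different size" class of warnings on 32-bit builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space analogue of the kernel macro: convert via
 * uintptr_t so the 64-bit value is narrowed to pointer width explicitly
 * instead of being cast straight to a pointer.
 */
static inline void *u64_to_ptr(uint64_t x)
{
	return (void *)(uintptr_t)x;
}

/* Inverse helper: pack a pointer into a u64 field, as an ioctl struct would carry it. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}
```

A pointer packed with ptr_to_u64() and unpacked with u64_to_ptr() compares equal to the original on both 32-bit and 64-bit targets.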
* habanalabs: initialize variable to default value | Tomer Tayar | 2020-06-01 | 1 | -1/+1
  Fix the following smatch error in unmap_device_va():
  error: uninitialized symbol 'rc'.

  Signed-off-by: Tomer Tayar <ttayar@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
  Link: https://lore.kernel.org/r/20200601065648.8775-1-oded.gabbay@gmail.com
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* habanalabs: handle MMU cache invalidation timeout | Omer Shpigelman | 2020-05-25 | 4 | -38/+75
  An MMU cache invalidation timeout indicates that the device is unstable and therefore unusable. Hence, in such a case, do a hard reset and return an error to the user if the invalidation was called from an ioctl.

  In addition, change the print to error level and rephrase its text.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: don't allow hard reset with open processes | Omer Shpigelman | 2020-05-25 | 1 | -6/+11
  When the MMU is heavily used by the engines, unmapping might take a long time due to the full MMU cache invalidation done as part of the unmap flow. Hence, we might not be able to kill all open processes before hard-resetting the device, as killing them involves unmapping all user memory.

  If killing all open processes fails, we should stop the hard reset flow, as continuing might lead to a kernel crash: one thread (killing a process) is updating MMU structures that another thread (hard reset) is freeing.

  Stopping a hard reset flow leaves the device nonoperational, and the user can then initiate a hard reset via sysfs to reinitialize the device.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: GAUDI does not support soft-reset | Oded Gabbay | 2020-05-25 | 5 | -17/+35
  GAUDI does not support soft-reset as it leaves the NIC ports in an awkward state, where their QMANs were reset but the NIC itself is still working.

  In addition, there is not much sense in doing soft-reset when training is done on multiple GAUDIs.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
* habanalabs: add print for soft reset due to event | Omer Shpigelman | 2020-05-25 | 1 | -2/+10
  Print the event name that caused the soft reset.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: improve MMU cache invalidation code | Omer Shpigelman | 2020-05-25 | 1 | -2/+4
  A new sequence is introduced to invalidate the MMU cache in order to avoid timeouts.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: don't set default fence_ops->wait | Daniel Vetter | 2020-05-25 | 1 | -1/+0
  It's the default.

  Also so much for "we're not going to tell the graphics people how to review their code", dma_fence is a pretty core piece of gpu driver infrastructure. And it's very much uapi relevant, including piles of corresponding userspace protocols and libraries for how to pass these around.

  Would be great if habanalabs would not use this (from a quick look it's not needed at all), since open source the userspace and playing by the usual rules isn't on the table. If that's not possible (because it's actually using the uapi part of dma_fence to interact with gpu drivers) then we have exactly what everyone promised we'd want to avoid.

  Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: update patched_cb_size for Wreg32 | Rachel Stahl | 2020-05-19 | 1 | -0/+1
  The patched_cb_size is not updated for Wreg32 in its validate function, so update it in goya_validate_cb.

  Signed-off-by: Rachel Stahl <rstahl@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: move event handling to common firmware file | Ofir Bitton | 2020-05-19 | 6 | -1362/+801
  Instead of writing similar event handling code for each ASIC, move the code to the common firmware file. This code will be used for GAUDI and all future ASICs.

  In addition, add two new fields to the auto-generated events file: valid and description. This removes the need to manually write the event descriptions in the source code and simplifies the code.

  Signed-off-by: Ofir Bitton <obitton@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: enable gaudi code in driver | Oded Gabbay | 2020-05-19 | 3 | -5/+7
  Enable the GAUDI ASIC code in the pci probe callback of the driver so the driver will handle GAUDI ASICs.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add gaudi profiler module | Omer Shpigelman | 2020-05-19 | 4 | -3/+890
  Add the GAUDI code to initialize the ASIC's profiler. The profiler receives its initialization values from the user, same as in Goya, but the initialization code is in the driver because the configuration space of the device is not directly exposed to the user.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add gaudi security module | Omer Shpigelman | 2020-05-19 | 4 | -1/+9094
  Add the code to initialize the security module of GAUDI. Similar to Goya, we have two dedicated mechanisms for security: Range Registers and Protection bits. Those mechanisms protect sensitive memory and configuration areas inside the device.

  In addition, in Gaudi we moved to a 3-level security scheme, where the F/W runs with the highest security level (Privileged), the driver runs with a less secured level (Secured) and the user is neither privileged nor secured.

  The security module in the driver configures the Secured parts so the user won't be able to access them. The Privileged parts are configured by the F/W.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add hwmgr module for gaudi | Oded Gabbay | 2020-05-19 | 4 | -4/+130
  The hwmgr module is responsible for messages sent to GAUDI F/W that are not common to all habanalabs ASICs.

  In GAUDI, we provide the user a simplified mode of controlling the ASIC clock frequency. Instead of three different clocks, we present a single clock property that the user can configure via sysfs.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add gaudi asic-dependent code | Oded Gabbay | 2020-05-19 | 7 | -1/+8335
  Add the ASIC-dependent code for GAUDI. Supply (almost) all of the function callbacks that the driver's common code needs to initialize, finalize and submit workloads to the GAUDI ASIC.

  It also contains the code to initialize the F/W of the GAUDI ASIC and to receive events from the F/W.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add gaudi asic registers header files | Oded Gabbay | 2020-05-19 | 91 | -1/+71211
  Add the relevant GAUDI ASIC registers header files. These files are generated automatically by a tool maintained by the VLSI engineers.

  There are more files which are not upstreamed because only very few defines from those files are used in the driver. For those files, we copied the relevant defines into gaudi_regs.h and gaudi_masks.h, to reduce the size of this patch.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: get card type, location from F/W | Omer Shpigelman | 2020-05-19 | 2 | -3/+21
  For Gaudi the driver gets two new additional properties from the F/W:
  1. The card's type - PCI or PMC
  2. The card's location in the Gaudi's box (relevant only for PMC).

  The card's location is also passed to the user in the HW IP info structure as it needs this property for establishing communication between Gaudis.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: support clock gating enable/disable | Oded Gabbay | 2020-05-19 | 5 | -0/+84
  In Gaudi there is a feature of clock gating certain engines. Therefore, add this property to the device structure.

  In addition, due to a limitation of this feature, the driver needs to dynamically enable or disable it during run-time. Therefore, add ASIC interface functions to enable/disable clock gating from the common code.

  Moreover, this feature must be turned off when the user wishes to debug the ASIC by reading/writing registers and/or memory through the driver's debugfs. Therefore, add an option to enable/disable clock gating via the debugfs interface.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: set PM profile to auto only for goya | Oded Gabbay | 2020-05-19 | 1 | -1/+4
  For Gaudi, the driver doesn't change the PM profile automatically due to device-controlled PM capabilities. Therefore, set the PM profile to auto only for Goya so the driver's code to automatically change the profile won't run on Gaudi.

  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add dedicated define for hard reset | Omer Shpigelman | 2020-05-19 | 2 | -2/+5
  Gaudi requires a longer wait during reset because the network ports must be closed. Add this explanation to the relevant comment in the code and add a dedicated define for this reset timeout period, instead of multiplying another define.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: check if CoreSight is supported | Omer Shpigelman | 2020-05-19 | 2 | -0/+3
  CoreSight is not supported on the simulator; therefore, add a boolean for checking that (currently used by un-upstreamed code).

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add signal/wait to CS IOCTL operations | Omer Shpigelman | 2020-05-19 | 3 | -28/+323
  Add the following two operations to the CS IOCTL:

  Signal: The signal operation is basically a command submission that is created by the driver upon user request. It will be implemented using a dedicated PQE that will increment a specific SOB. There will be a new flag: HL_CS_FLAGS_SIGNAL. When the user sets this flag in the CS IOCTL structure, the driver will execute a dedicated code path that will prepare this special PQE and submit it. The user only needs to provide a queue index on which to put the signal.

  Wait: The wait operation is also a command submission that is created by the driver upon user request. It will be implemented using a dedicated PQE that will contain packets of "ARM a monitor" + FENCE packet. There will be a new flag: HL_CS_FLAGS_WAIT. When the user sets this flag in the CS structure, the driver will execute a dedicated code path that will prepare this special PQE and submit it. The user needs to provide the following parameters:
  1. queue ID
  2. an array of signal_seq numbers and the number of signals to wait on (the length of signal_seq_arr).

  The IOCTL will return the CS sequence number of the wait it put on the queue ID.

  Currently, the code supports signal_seq_nr==1. But this API definition will allow us to put a single PQE that waits on multiple signals.

  To correctly configure the monitor and fence, the driver will need to retrieve the specified signal CS object that contains the relevant SOB and its expected value. In case the signal CS has already been completed, there is no point in adding a wait operation. In this case, the driver will return to the user *without* putting anything on the PQ. The return code should reflect to the user that the signal was completed, as we won't return a CS sequence number for this wait.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
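The wait-side decision described in this message (look up the signal CS, skip the wait if it already completed, otherwise configure a monitor from its SOB and expected value) can be sketched as a small user-space model. All type and function names here are hypothetical and are not taken from the driver; the WAIT_UNKNOWN_SEQ result in particular is an assumption, since the message does not say what happens for an unknown sequence number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of a previously submitted signal CS. */
struct signal_cs {
	uint64_t seq;		/* CS sequence number returned to the user */
	uint32_t sob_id;	/* SOB that the signal increments */
	uint32_t sob_val;	/* value the SOB holds once the signal lands */
	bool completed;
};

/* Hypothetical monitor configuration derived from the signal CS. */
struct monitor_cfg {
	uint32_t sob_id;
	uint32_t target_val;
};

enum wait_result {
	WAIT_QUEUED,		/* a wait PQE would be put on the queue */
	WAIT_ALREADY_SIGNALED,	/* nothing queued, no CS sequence returned */
	WAIT_UNKNOWN_SEQ,	/* assumption: the sequence number was not found */
};

enum wait_result prepare_wait(const struct signal_cs *sigs, size_t n,
			      uint64_t signal_seq, struct monitor_cfg *mon)
{
	for (size_t i = 0; i < n; i++) {
		if (sigs[i].seq != signal_seq)
			continue;
		if (sigs[i].completed)
			return WAIT_ALREADY_SIGNALED;
		/* Arm the monitor on the SOB and value the signal will reach. */
		mon->sob_id = sigs[i].sob_id;
		mon->target_val = sigs[i].sob_val;
		return WAIT_QUEUED;
	}
	return WAIT_UNKNOWN_SEQ;
}
```

The model handles a single signal sequence number, matching the signal_seq_nr==1 limitation mentioned above.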
* habanalabs: handle the h/w sync object | Omer Shpigelman | 2020-05-19 | 3 | -27/+188
  Define a structure representing the h/w sync object (SOB).

  A SOB can contain up to 2^15 values. Each signal CS will increment the SOB by 1, so after some time we will reach the maximum number the SOB can represent. When that happens, the driver needs to move to a different SOB for the signal operation.

  A SOB can be in 1 of 4 states:
  1. Working state with value < 2^15
  2. We reached a value of 2^15, but the signal operations weren't completed yet OR there are pending waits on this signal. For the next submission, the driver will move to another SOB.
  3. ALL the signal operations on the SOB have finished AND there are no more pending waits on the SOB AND we reached a value of 2^15 (this basically means the refcnt of the SOB is 0 - see explanation below). When that happens, the driver can clear the SOB by simply doing WREG32 0 to it and set the refcnt back to 1.
  4. The SOB is cleared and can be used next time by the driver when it needs to reuse a SOB.

  Per SOB, the driver will maintain a single refcnt that will be initialized to 1. When a signal or wait operation on this SOB is submitted to the PQ, the refcnt will be incremented. When a signal or wait operation on this SOB completes, the refcnt will be decremented. After the submission of the signal operation that increments the SOB to a value of 2^15, the refcnt is also decremented.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
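The refcount rules above can be checked with a small user-space model. This is a sketch under assumptions: the struct and function names are hypothetical, the resets counter exists only to make the clear observable, and the actual register write (WREG32 0) is represented by zeroing a field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SOB_MAX_VAL (1u << 15)	/* maximum value per the commit message */

/* Hypothetical model of one h/w sync object. */
struct sob_model {
	uint32_t submitted;	/* value the SOB reaches once all signals land */
	int refcnt;		/* starts at 1, as described above */
	unsigned int resets;	/* times the SOB was cleared (model only) */
};

void sob_init(struct sob_model *s)
{
	s->submitted = 0;
	s->refcnt = 1;
	s->resets = 0;
}

/* Drop one reference; at zero, clear the SOB and re-arm refcnt (states 3 -> 4). */
void sob_put(struct sob_model *s)
{
	assert(s->refcnt > 0);
	if (--s->refcnt == 0) {
		s->submitted = 0;	/* stands in for WREG32 0 to the SOB */
		s->refcnt = 1;
		s->resets++;
	}
}

/* Returns false when the SOB is exhausted and another SOB must be used (state 2). */
bool sob_submit_signal(struct sob_model *s)
{
	if (s->submitted == SOB_MAX_VAL)
		return false;
	s->refcnt++;			/* reference held by the in-flight signal */
	if (++s->submitted == SOB_MAX_VAL)
		sob_put(s);		/* drop the initial reference */
	return true;
}

/* A wait on this SOB also holds a reference until it completes. */
void sob_submit_wait(struct sob_model *s)
{
	s->refcnt++;
}

/* Called when a signal or wait operation on this SOB completes. */
void sob_complete(struct sob_model *s)
{
	sob_put(s);
}
```

In this model the SOB is cleared only after the value has reached 2^15 and every in-flight signal and wait has completed, because the initial reference is dropped only by the signal that reaches the maximum.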
* habanalabs: define ASIC-dependent interface for signal/wait | Omer Shpigelman | 2020-05-19 | 2 | -0/+58
  This feature requires handling h/w resources which are a bit different from one ASIC to the other.

  Therefore, we need to define a set of interfaces the ASIC code provides to the common code to signal, wait, reset sync object and to reset and init a queue.

  As this feature is not supported in Goya, provide an empty implementation of those functions.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* uapi: habanalabs: add signal/wait operations | Omer Shpigelman | 2020-05-19 | 2 | -1/+14
  This is a prerequisite to upstreaming GAUDI support.

  Signal/wait operations are done by the user to perform sync between two Primary Queues (PQs). The sync is done using the sync manager and it is usually resolved inside the device, but sometimes it can be resolved in the host, i.e. the user should be able to wait in the host until a signal has been completed.

  The signal and wait operations are defined by the driver because they need atomicity and serialization, which the driver already provides when submitting work to the different queues.

  To implement this feature, the driver "takes" a couple of h/w resources, and this is reflected by the defines added to the uapi file.

  The signal/wait operations are done via the existing CS IOCTL, and they use the same data structure. There is a difference in the meaning of some of the parameters, and for that we added unions to make the code more readable.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: add missing MODULE_DEVICE_TABLE | Oded Gabbay | 2020-05-19 | 1 | -0/+1
  PCI drivers should use this define to declare their PCI ID table.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: print all CB handles as hex numbers | Dotan Barak | 2020-05-19 | 1 | -2/+2
  Print all CB handles the same way, instead of some as decimal and some as hex numbers.

  Signed-off-by: Dotan Barak <dbarak@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: update F/W register map | Oded Gabbay | 2020-05-19 | 1 | -21/+23
  Update the mapping to the latest one used by the Firmware. No impact on the driver in this update.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: enable trace data compression (profiler) | Adam Aharon | 2020-05-19 | 1 | -1/+1
  Set the STMTCSR.COMPEN bit to enable leading-zero trace data compression functionality for the extended stimulus ports.

  Signed-off-by: Adam Aharon <aaharon@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: load CPU device boot loader from host | Ofir Bitton | 2020-05-19 | 4 | -73/+77
  Load the device CPU boot loader during driver boot time in order to avoid a flash write for every boot loader update.

  To preserve backward compatibility, skip loading the boot loader if the device doesn't request it.

  Signed-off-by: Ofir Bitton <obitton@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: leave space for 2xMSG_PROT in CB | Oded Gabbay | 2020-05-19 | 1 | -7/+17
  The user must leave space for 2xMSG_PROT in the external CB, so adjust the define of the maximum size accordingly. The driver, however, can still create a CB with the maximum size of 2MB. Therefore, we need to add a check specifically for the size requested by the user.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
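The two limits described here (one for user requests, one for driver-created CBs) can be sketched as below. The 2MB ceiling comes from the commit message; the 16-byte MSG_PROT packet size, the macro names and the function name are assumptions made for illustration and are not the driver's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define CB_MAX_SIZE		0x200000u	/* 2MB, per the commit message */
#define MSG_PROT_PKT_SIZE	16u		/* assumed packet size, illustration only */
/* User CBs must leave room for the two MSG_PROT packets the driver appends. */
#define CB_MAX_USER_SIZE	(CB_MAX_SIZE - 2 * MSG_PROT_PKT_SIZE)

/* Returns 0 if the requested size is acceptable, -EINVAL otherwise. */
int check_cb_size(uint32_t size, bool requested_by_user)
{
	uint32_t limit = requested_by_user ? CB_MAX_USER_SIZE : CB_MAX_SIZE;

	return size > limit ? -EINVAL : 0;
}
```

With these assumed values, a user request for the full 2MB is rejected while the same request from the driver itself is accepted.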
* habanalabs: support hwmon_reset_history attribute | Christine Gharzuzi | 2020-05-19 | 3 | -3/+97
  Support the hwmon_temp_reset_history, hwmon_in_reset_history and hwmon_curr_reset attributes, which reset the historical highest value.

  Signed-off-by: Christine Gharzuzi <cgharzuzi@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: Align protection bits configuration of all TPCs | Tomer Tayar | 2020-05-19 | 1 | -1/+98
  Align the protection bits configuration of all TPC cores to match that of TPC core 0.

  Fixes: a513f9a7eca5 ("habanalabs: make tpc registers secured")
  Signed-off-by: Tomer Tayar <ttayar@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: Allow access to TPC LFSR register | Tomer Tayar | 2020-05-19 | 1 | -1/+0
  Allow user access to TPC LFSR register, as it might be accessed by TPC kernels.

  Signed-off-by: Tomer Tayar <ttayar@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: Add INFO IOCTL opcode for time sync information | Tomer Tayar | 2020-05-19 | 6 | -1/+88
  Add a new opcode to the INFO IOCTL that retrieves the device time alongside the host time, to allow a user application that wants to measure device time together with host time (such as a profiler) to synchronize these times.

  Signed-off-by: Tomer Tayar <ttayar@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
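A profiler could use such a (device time, host time) pair to place device timestamps on the host timeline. The sketch below shows only the arithmetic; the struct layout and names are hypothetical and do not reflect the actual uapi structure, and it assumes both clocks use the same unit and tick at the same rate between samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of timestamps sampled together by the driver. */
struct time_sync_sample {
	uint64_t device_time;	/* device clock at the sampling point */
	uint64_t host_time;	/* host clock at the same point */
};

/*
 * Map a device timestamp onto the host timeline. Unsigned wrap-around makes
 * this correct for timestamps both before and after the sampling point.
 */
uint64_t device_to_host_time(const struct time_sync_sample *s, uint64_t device_ts)
{
	return s->host_time + (device_ts - s->device_time);
}
```

Any drift between the two clocks is ignored here; a real tool would likely resample periodically.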
* habanalabs: hl_pci_set_dma_mask() can be static | kbuild test robot | 2020-05-19 | 1 | -1/+1
  Set the function to be static, as it is not called from outside its file.

  Signed-off-by: kbuild test robot <lkp@intel.com>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: handle barriers in DMA QMAN streams | Oded Gabbay | 2020-05-17 | 5 | -4/+23
  When we have a DMA QMAN with multiple streams, we need to know whether the command buffer contains at least one DMA packet in order to configure the barriers correctly when adding the 2xMSG_PROT at the end of the JOB. If there is no DMA packet, then there is no need to put an engine barrier.

  This is relevant only for GAUDI, as GOYA doesn't have streams, so the engine can't be kept busy by another stream.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
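The rule in this message (set the engine barrier only when the command buffer contains a DMA packet) can be modeled as follows. The opcode names, the flag bits and both function names are hypothetical; the choice to always set a message barrier is also an assumption, as the message only speaks about the engine barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet opcodes, chosen only for this illustration. */
enum pkt_opcode { PKT_WREG32, PKT_MSG_LONG, PKT_LIN_DMA, PKT_FENCE };

/* Hypothetical barrier flag bits for the end-of-job MSG_PROT packets. */
#define MSG_PROT_MB (1u << 0)	/* message barrier (assumed always set) */
#define MSG_PROT_EB (1u << 1)	/* engine barrier */

/* Scan the parsed command buffer for at least one DMA packet. */
bool cb_contains_dma_pkt(const enum pkt_opcode *pkts, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pkts[i] == PKT_LIN_DMA)
			return true;
	return false;
}

/* Pick barrier flags: the engine barrier is needed only after DMA work. */
uint32_t end_of_job_barriers(bool contains_dma_pkt)
{
	uint32_t flags = MSG_PROT_MB;

	if (contains_dma_pkt)
		flags |= MSG_PROT_EB;
	return flags;
}
```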
* habanalabs: retrieve DMA mask indication from firmware | Oded Gabbay | 2020-05-17 | 4 | -39/+53
  Retrieve from the firmware the DMA mask value we need to set according to the device's PCI controller configuration. This is needed when working on POWER9 machines, as the device's PCI controller is configured in a different way in those machines.

  Reviewed-by: Tomer Tayar <ttayar@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: update firmware definitions | Oded Gabbay | 2020-05-17 | 2 | -3/+49
  Add comments for the various errors and states of the firmware during boot.

  Add a mapping of a new register that will tell the driver whether the firmware executed the request from the driver or if it has encountered an error. Add a new enum for the possible values of this register.

  Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: increase timeout during reset | Oded Gabbay | 2020-05-17 | 1 | -1/+1
  When doing training, the DL framework (e.g. TensorFlow) performs hundreds of thousands of memory allocations and mappings.

  In case the driver needs to perform a hard reset during training, the driver kills the application and unmaps all those memory allocations. Unfortunately, because of that large number of mappings, the driver isn't able to do that within the current timeout (5 seconds). Therefore, increase the timeout significantly to 30 seconds to avoid a situation where the driver resets the device with active mappings, which can sometimes cause a kernel bug.

  BTW, it doesn't mean we will spend all 30 seconds, because the reset thread checks every second whether the unmap operation is done.

  Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
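The last paragraph describes a poll-with-timeout loop: the timeout is an upper bound, and the wait ends as soon as the unmap is done. A minimal user-space sketch of that pattern follows. The function and callback names are hypothetical, and the sleep is injected as a callback so the logic can be exercised without real delays; the two example callbacks are stand-ins, not driver code.

```c
#include <assert.h>
#include <stdbool.h>

typedef bool (*done_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int seconds);

/*
 * Poll once per second until done() reports completion or timeout_sec
 * seconds have been waited. Returns the seconds waited, or -1 on timeout.
 */
int wait_for_unmap(done_fn done, void *ctx, sleep_fn sleep_sec,
		   unsigned int timeout_sec)
{
	for (unsigned int waited = 0; ; waited++) {
		if (done(ctx))
			return (int)waited;
		if (waited == timeout_sec)
			return -1;
		sleep_sec(1);
	}
}

/* Example callback: ctx is a counter of polls left until the work is done. */
bool countdown_done(void *ctx)
{
	unsigned int *polls_left = ctx;

	if (*polls_left == 0)
		return true;
	(*polls_left)--;
	return false;
}

/* Example sleep stub that returns immediately. */
void no_sleep(unsigned int seconds)
{
	(void)seconds;
}
```

With a 30-second bound, work that finishes after three polls returns after three simulated seconds rather than the full timeout.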
* habanalabs: print warning when reset is requested | Oded Gabbay | 2020-05-17 | 1 | -0/+4
  When the system administrator asks the driver to soft or hard reset the device through sysfs, the driver should display a warning in the kernel log to explain why it suddenly resets the device.

  Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: unify and improve device cpu init | Oded Gabbay | 2020-05-17 | 5 | -113/+220
  Move the code of device CPU initialization from the ASIC-dependent code to the common code. In addition, add support for the new error reporting feature of the firmware boot code.

  Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
* habanalabs: re-factor H/W queues initialization | Omer Shpigelman | 2020-05-17 | 5 | -14/+29
  We want to remove the following restrictions/assumptions in our driver:
  1. The H/W queue index is also the completion queue index.
  2. The H/W queue index is also the IRQ number of the completion queue.
  3. All queues of the same type have consecutive indexes.

  Therefore, add support for H/W queues of the same type with nonconsecutive indexes, and for a completion queue index and IRQ number that differ from the H/W queue index.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
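One way to picture the decoupling is to store the completion queue index and the IRQ vector per queue instead of deriving them from the queue's position. The sketch below is a model under assumptions: the types, field names and the rule that only external queues get a completion queue are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum queue_type { QUEUE_TYPE_NA, QUEUE_TYPE_EXT, QUEUE_TYPE_INT, QUEUE_TYPE_CPU };

/* Hypothetical per-queue record with explicit CQ and IRQ assignments. */
struct hw_queue_model {
	enum queue_type type;
	bool has_cq;		/* assumption: only external queues get a CQ */
	uint32_t cq_id;		/* completion queue index, independent of position */
	uint32_t msi_vec;	/* IRQ vector of that completion queue */
};

/*
 * Walk all queues and hand out completion queue indexes compactly, so that
 * queues of the same type need not sit at consecutive H/W queue indexes.
 * Returns the number of completion queues assigned.
 */
size_t assign_completion_queues(struct hw_queue_model *queues, size_t n,
				uint32_t first_msi_vec)
{
	size_t cq_count = 0;

	for (size_t i = 0; i < n; i++) {
		queues[i].has_cq = (queues[i].type == QUEUE_TYPE_EXT);
		if (!queues[i].has_cq)
			continue;
		queues[i].cq_id = (uint32_t)cq_count;
		queues[i].msi_vec = first_msi_vec + (uint32_t)cq_count;
		cq_count++;
	}
	return cq_count;
}
```

In this model an external queue at H/W index 4 can own completion queue 2, which none of the three original assumptions would allow.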
* habanalabs: remove stop-on-error flag from DMA | Omer Shpigelman | 2020-05-17 | 4 | -3/+63
  Stop-on-error mode in DMA is useful as it stops the transaction immediately upon an error, e.g. a page fault. But it may cause the next command submission to fail, as it leaves the DMA in an unstable state. Therefore, we remove the stop-on-error configuration from the DMA.

  Stop-on-error is still available for debug.

  Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
  Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
  Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>