1. hipDeviceGetLimit API for HCC path is added
2. Test for hipDeviceGetLimit API is added
3. The feature added only supports querying heap size
4. Corrected indents for malloc and free device functions
5. Removed redundant data structures
6. Added g_heap_malloc_size to store the heap size
Change-Id: If48d1b0ce9270e994f1c542cc283ddbb14746bbb
1. Added malloc and free device functions
2. Added test which check malloc and free functions
TODO: Need to add support for multiple device. Works only on one device (multi device support id NOT available).
Change-Id: Id11fc36463915d6ad46c264d5a20c8feb2d2c17c
- Minimize time that locks are held.
- Eliminate copy code that locked stream and ctx at same time.
- Stream was locked to ensure thread-safe enqueue to the queue.
- Devices were locked to query peer-lists.
Change-Id: Ibe8880bb7fb995a3da8f90ff911f212d81525018
Remove memory leak with new hc::completion_future.
Implement HIP_LAUNCH_BLOCKING with queue-level wait.
Change-Id: I45975f81c4d239fdeed7776970988d28449865dc
- Bug fix for peer visibility. Now contexts correctly detect when they can use SDMA for P2P vs staging buffers.
- Interface to new HCC copy_ext function.
- Improve context and peer print /debug options.
- Add comments and usage to hipPeerToPeer_simple test.
Code simplification/cleanup:
Remove stale fallback paths that uses something besides the unpinned engine.
Remove HIP_STAGING_BUFFER env var - now is const 2, 0 no longer has
special meaning.
Change-Id: I7d24cdd1067dd0c244e87b6a83897cb135d307e7
- Fix calls to HIP_INIT_API to pass all function arguments.
- Change ihipFunction to follow coding convention:
- leading underscore for member fields,
- camelCase for member fields.
- move kernel print function inside ihipPreLaunchKernel.
- add HIP_TRACE_API_COLOR, control color of messages.
- add ihipLogStatus wrapper to hipDeviceSynchronize()
Change-Id: I20bbb644da213f821404648945197254e3648fc9
- move _av into stream critical section. ( HCC accelerator_view is not
thread-safe but HIP steram is. )
- Refactored many places in code that need to acquire critical section.
some were previously thread races, ie enqueueing marker.
-remove support for GRID_LAUNCH_VERSION < 20
-Enable USE_AV_COPY based on HCC work-week.
- Review hipModule docs, some calrity/editing.
Change-Id: I3ce7c25ece048c3504f55ecd4683e506bb1fc8b6
- Use standard print functions for streams.
- Add HIP_INIT macro, for cases where we want to initialize HIP but not
log an API (ihipPreKernelLaunch).
Change-Id: If43cf8a363d918bcd3722a2e6a965d4cfa2e03e7
- The module kernel launch is now in sync with commands in its stream
- Moved launch kernel inside ihipStream
Change-Id: Ic00cfcf4882bf81b6203c36881a52575ea68b529
APIs: hipInit, hipCtxCreate.
Track TLS default ctx. Set deviceID now changes the ctx.
Add first context test.
Change-Id: If1cb9989b5a04a36147e25e84904336c7b6f3d88
Change ihipDevice_t -> ihipCtx_t (new)
Change ihipGetTlsDefaultDevice->ihipGetTlsDefaultCtx
Some other changes from device->ctx where appropriate.
Change-Id: I5c4ae93b2fd42c6303aa23d748eb166b7431925d
1. Before, the signal pool is increased depending on the usage
2. After, a static number of signals are allocated to the pool
Only these are used by hip in a stream
3. If the signals required are more than the pool size, the
stream has to wait to make sure all the signals are available
4. Once they are available, the stream can use them
5. Removed HIP_NUM_SIGNALS_PER_STREAM because of redundancy with HIP_STREAM_SIGNALS
6. Increased signal count from 2 to 32.
Future Work: Dynamically increase the pool size depending on the number of
streams allocated by the application. And, null stream should have more signals
Change-Id: I6be36e084f26bb04766fabf776c7210aee0f9e91