* Enabling clique for any XGMI-connected topology, adding tuning
* Updating CHANGELOG for clique tuning
* Re-working clique barrier system to work on multi-process / multi-gpu