1. **Lattice QCD package GWU\-code and QUDA with HIP**, Yu\-Jiang Bi, Yi Xiao, Wei\-Yi Guo, Ming Gong, Peng Sun, Shun Xu, and Yi\-Bo Yang, *PoS* **LATTICE2019** 286 (2020) [arXiv:2001.05706](https://arxiv.org/abs/2001.05706) [doi:10.22323/1.363.0286](https://doi.org/10.22323/1.363.0286)

2. **Multigrid algorithm for staggered lattice fermions**, Richard C\. Brower, M\. A\. Clark, Alexei Strelchenko, and Evan Weinberg, *Phys\. Rev\. D* **97** (11) 114513 (2018) [arXiv:1801.07823](https://arxiv.org/abs/1801.07823) [doi:10.1103/PhysRevD.97.114513](https://doi.org/10.1103/PhysRevD.97.114513)

3. **Accelerating lattice QCD multigrid on GPUs using fine\-grained parallelization**, M\. A\. Clark, Bálint Joó, Alexei Strelchenko, Michael Cheng, Arjun Gambhir, and Richard C\. Brower, *International Conference for High Performance Computing, Networking, Storage and Analysis* (2016) [arXiv:1612.07873](https://arxiv.org/abs/1612.07873) [doi:10.5555/3014904.3014995](https://doi.org/10.5555/3014904.3014995)

4. **Scaling lattice QCD beyond 100 GPUs**, R\. Babich, M\. A\. Clark, B\. Joo, G\. Shi, R\. C\. Brower, and S\. Gottlieb, *International Conference for High Performance Computing, Networking, Storage and Analysis* (2011) [arXiv:1109.2935](https://arxiv.org/abs/1109.2935) [doi:10.1145/2063384.2063478](https://doi.org/10.1145/2063384.2063478)

5. **Efficient implementation of the overlap operator on multi\-GPUs**, Andrei Alexandru, Michael Lujan, Craig Pelissier, Ben Gamari, and Frank X\. Lee, *2011 Symposium on Application Accelerators in High\-Performance Computing \(SAAHPC'11\)* 123–130 (2011) [arXiv:1106.4964](https://arxiv.org/abs/1106.4964) [doi:10.1109/SAAHPC.2011.13](https://doi.org/10.1109/SAAHPC.2011.13)

6. **Multi\-mass solvers for lattice QCD on GPUs**, A\. Alexandru, C\. Pelissier, B\. Gamari, and F\. Lee, *J\. Comput\. Phys\.* **231** 1866–1878 (2012) [arXiv:1103.5103](https://arxiv.org/abs/1103.5103) [doi:10.1016/j.jcp.2011.11.003](https://doi.org/10.1016/j.jcp.2011.11.003)

7. **Overlap Valence on 2\+1 Flavor Domain Wall Fermion Configurations with Deflation and Low\-mode Substitution**, A\. Li et al\., *Phys\. Rev\. D* **82** 114501 (2010) [arXiv:1005.5424](https://arxiv.org/abs/1005.5424) [doi:10.1103/PhysRevD.82.114501](https://doi.org/10.1103/PhysRevD.82.114501)

8. **Solving Lattice QCD systems of equations using mixed precision solvers on GPUs**, M\. A\. Clark, R\. Babich, K\. Barros, R\. C\. Brower, and C\. Rebbi, *Comput\. Phys\. Commun\.* **181** 1517–1528 (2010) [arXiv:0911.3191](https://arxiv.org/abs/0911.3191) [doi:10.1016/j.cpc.2010.05.002](https://doi.org/10.1016/j.cpc.2010.05.002)

9. **Highly improved staggered quarks on the lattice, with applications to charm physics**, E\. Follana, Q\. Mason, C\. Davies, K\. Hornbostel, G\. P\. Lepage, J\. Shigemitsu, H\. Trottier, and K\. Wong, *Phys\. Rev\. D* **75** 054502 (2007) [arXiv:hep-lat/0610092](https://arxiv.org/abs/hep-lat/0610092) [doi:10.1103/PhysRevD.75.054502](https://doi.org/10.1103/PhysRevD.75.054502)

10. **The Chroma software system for lattice QCD**, Robert G\. Edwards and Bálint Joó, *Nucl\. Phys\. B Proc\. Suppl\.* **140** 832 (2005) [arXiv:hep-lat/0409003](https://arxiv.org/abs/hep-lat/0409003) [doi:10.1016/j.nuclphysbps.2004.11.254](https://doi.org/10.1016/j.nuclphysbps.2004.11.254)

11. **Krylov space solvers for shifted linear systems**, Beat Jegerlehner (1996) [arXiv:hep-lat/9612014](https://arxiv.org/abs/hep-lat/9612014)
