Okay, I misunderstood the source of the data; I wouldn't have asked if I had understood. I thought the point of the project was a new learning model/methodology in and of itself.
Thanks for taking the time. Now I'm a little less ignorant.
"The git" is just a term some of the programmers I've worked with on FC and ESC projects used to refer to any specific repository. I figured it was common slang.
mnem
No worries.
It will be a bit more difficult anyway.
This FPGA gizmo comes on a PCI-E mining card, which means it lacks the fancy connectors you would normally want for all those I/O functions.
This means that I have to:
1) find a way to use the card as a more or less generic accelerator card
2) find a way to access it from Vivado
3) PyTorch is dead as far as I am concerned (and as far as NMT goes); the framework to be used is TensorFlow
4) hack OpenNMT to use TensorFlow, integrate it into my toolchain, and do a PoC TensorFlow training for a popular language, let's say Klingon (see the training sketch after this list)
5) set up some basic test suite to deploy the TensorFlow model (and test it), ideally with a REST API (see the smoke-test sketch below)
6) remodel that stuff and check if I can use the xDNN or the FINN toolkit to generate deployable FPGA models with it
7) test the PoC on the FPGA
8) if successful, run a real-life test with a larger model (let's say English -> French with 15 million sentence pairs out of a 275 million pool; see the sampling sketch below)
9) check if training can be ported to such an FPGA (if not, well, too bad ...)
10) benchmark it against what we have now (4x Tesla P100 for training, 400 CPU cores for the deployed models; see the throughput sketch below)
11) find someone who pays for all of that ...
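
A note on step 4: OpenNMT already ships a TensorFlow implementation (OpenNMT-tf), so the "hacking" may boil down to configuration. Here is a minimal training sketch using its Python API; the run directory, vocabularies, and data files are placeholders I haven't verified against any real Klingon corpus:

```python
# Minimal OpenNMT-tf training run (a sketch; paths and vocabularies are
# placeholders, and nothing here has been verified against real data).
import opennmt

config = {
    "model_dir": "run/klingon_poc",  # placeholder run directory
    "data": {
        "source_vocabulary": "data/src_vocab.txt",
        "target_vocabulary": "data/tgt_vocab.txt",
        "train_features_file": "data/train.src",  # tokenized source side
        "train_labels_file": "data/train.tgt",    # tokenized target side
        "eval_features_file": "data/valid.src",
        "eval_labels_file": "data/valid.tgt",
    },
}

# TransformerBase is one of the catalog models bundled with OpenNMT-tf;
# auto_config=True pulls in its recommended hyperparameters.
model = opennmt.models.TransformerBase()
runner = opennmt.Runner(model, config, auto_config=True)
runner.train(with_eval=True)
```

The onmt-main CLI takes the same configuration as YAML, if a command line fits the toolchain better.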
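For step 5, TensorFlow Serving exposes a REST API out of the box, so the test suite can start as a thin client. A minimal smoke test, assuming the exported SavedModel is served under the name "nmt" on the default REST port 8501; the payload shape depends entirely on the model's serving signature, so treat it as a placeholder:

```python
# Smoke test against TensorFlow Serving's REST API (a sketch; the model
# name "nmt", the host, and the payload shape are assumptions that must
# match the exported serving signature).
import requests

SERVER = "http://localhost:8501/v1/models/nmt:predict"

def translate(tokens):
    """Send one tokenized sentence and return the raw predictions."""
    payload = {"instances": [{"tokens": tokens, "length": len(tokens)}]}
    response = requests.post(SERVER, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()["predictions"]

if __name__ == "__main__":
    # "Today is a good day to die" -- fitting for a Klingon PoC.
    print(translate(["Heghlu'meH", "QaQ", "jajvam"]))
```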
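For step 8, the subsampling has to keep source and target lines aligned. A sketch that filters both sides of the corpus in lockstep with probability 15/275, which lands near 15 million pairs without holding a 275-million-entry index set in memory; the file names are placeholders:

```python
# Sample ~15M aligned sentence pairs from a ~275M pair parallel corpus.
# A sketch: file names are placeholders, and the probabilistic filter
# yields approximately (not exactly) 15 million pairs.
import random

KEEP_PROB = 15_000_000 / 275_000_000

random.seed(42)  # reproducible subset if the run has to be repeated

with open("corpus.en") as src, open("corpus.fr") as tgt, \
     open("train.en", "w") as src_out, open("train.fr", "w") as tgt_out:
    for src_line, tgt_line in zip(src, tgt):
        # Decide per pair, so source and target stay aligned.
        if random.random() < KEEP_PROB:
            src_out.write(src_line)
            tgt_out.write(tgt_line)
```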
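And for the benchmark in step 10, the same REST endpoint can give a sentences-per-second figure that is comparable across the P100 box, the CPU farm, and the FPGA card. A sketch with the same caveats about model name and payload shape as the smoke test above; batch size and test set are arbitrary choices:

```python
# Rough sentences-per-second benchmark against the serving endpoint
# (a sketch; endpoint, payload shape, and test file are placeholders).
import time
import requests

SERVER = "http://localhost:8501/v1/models/nmt:predict"

def benchmark(sentences, batch_size=32):
    """Translate all sentences in batches, return sentences per second."""
    start = time.perf_counter()
    for i in range(0, len(sentences), batch_size):
        batch = [s.split() for s in sentences[i : i + batch_size]]
        payload = {"instances": [
            {"tokens": toks, "length": len(toks)} for toks in batch
        ]}
        requests.post(SERVER, json=payload, timeout=300).raise_for_status()
    return len(sentences) / (time.perf_counter() - start)

if __name__ == "__main__":
    with open("test.en") as f:  # placeholder test set
        sentences = [line.strip() for line in f][:10_000]
    print(f"{benchmark(sentences):.1f} sentences/sec")
```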