Are you thinking the D1s are faults? The datasheet shows an overlap for valid data at the end of the MREQ / IOREQ cycle, but the timing charts quote 0 ns for that time for a 10 MHz Z80 CPU, so in reality there is no need for the data to be held past the end of the MREQ/IOREQ cycles.
It does seem that z80_DATA_DIR and z80_BDIR_EN are HIGH for a long time before the Z80 actually needs to read the data off the bus.
The way to read the data sheet, for both data in and data out: the small time slot marked 'D1' means that you need to present valid data to the Z80 at some point before that slot begins, and then hold the data on the bus until after the 'D1' slot ends.
Now, when we send data to the Z80, figuring it takes an additional 5-10 ns for the data to get there from the assertion of the OE & data out signals, will the data reach the Z80 in time to ensure a good read?
And do we hold the data we are sending out to the Z80 long enough, until the end of the Z80 'D1' time slot, to guarantee the Z80 will receive a correct byte?
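To put numbers on those two questions, here is a rough sketch of the margin arithmetic in Python. It is only an illustration, not taken from your project: the class and field names are made up, the edge times are meant to be read off the ns ruler in your simulation, and the setup figure is a placeholder you would replace with the value from the datasheet for your CPU speed grade. The 5-10 ns path delay, the roughly 2.5 ns of extra FPGA output delay in timing mode, and the 0 ns hold figure are the numbers discussed in this post.

```python
from dataclasses import dataclass


@dataclass
class ReadTiming:
    """Edge times in ns, read off the simulation ruler (hypothetical names)."""
    drive_start_ns: float   # FPGA asserts OE / data out
    drive_end_ns: float     # FPGA releases the data
    d1_start_ns: float      # start of the Z80 'D1' sampling slot
    d1_end_ns: float        # end of the Z80 'D1' sampling slot
    path_delay_min_ns: float = 5.0     # fastest trip from FPGA to Z80
    path_delay_max_ns: float = 10.0    # slowest trip from FPGA to Z80
    fpga_output_delay_ns: float = 2.5  # extra delay seen in timing-mode simulation
    setup_ns: float = 0.0   # ASSUMPTION: placeholder, take the real value from the datasheet
    hold_ns: float = 0.0    # 0 ns hold, as quoted for the 10 MHz part


def read_margins(t: ReadTiming) -> tuple[float, float]:
    """Return (setup_margin, hold_margin) in ns; a negative value means a violation."""
    # Worst case for setup: data leaves late and travels slowly.
    worst_arrival = t.drive_start_ns + t.path_delay_max_ns + t.fpga_output_delay_ns
    setup_margin = (t.d1_start_ns - t.setup_ns) - worst_arrival

    # Worst case for hold: the release of the data reaches the Z80 as fast as possible.
    earliest_removal = t.drive_end_ns + t.path_delay_min_ns
    hold_margin = earliest_removal - (t.d1_end_ns + t.hold_ns)
    return setup_margin, hold_margin


if __name__ == "__main__":
    # Example edge times only, not measured values.
    timing = ReadTiming(drive_start_ns=100, drive_end_ns=200,
                        d1_start_ns=150, d1_end_ns=180)
    setup, hold = read_margins(timing)
    print(f"setup margin: {setup} ns, hold margin: {hold} ns")
```

If either margin comes out negative, or uncomfortably close to zero, move the OE / data out edges in the simulation until both are safely positive.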
Remember, the time scale on the simulation is accurate, so you have the ns ruler at the top. Simulate in timing mode and it will be even closer. (That adds up to around another 2.5 ns of delay on the FPGA outputs.)
Also, about the wait state: if you look at the port read and write cycles, there is an extra Z80 clock cycle, a wait state, which you may need to wait out before beginning a transaction just to be safe.
You have the function in your Z80 module, but not a basic setup where you can manipulate a response. You need to clean that up.
In the other direction, a Z80 write, you should be taking the data from the Z80 data bus after the valid 'D0' time has begun and before it ends. Now there is an additional caveat here too. You need to consider all the delays created by your bus's potential load on the data lines as well as the address lines.
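As a rough illustration of that caveat, the sketch below shrinks the 'D0' window by the bus delay to find when the FPGA can safely latch the write data. The function name and the delay figures in the example are assumptions for illustration only; the real delays depend on how heavily your bus is loaded, and the same reasoning applies to the address lines.

```python
from typing import Optional, Tuple


def safe_capture_window(d0_start_ns: float, d0_end_ns: float,
                        bus_delay_min_ns: float,
                        bus_delay_max_ns: float) -> Optional[Tuple[float, float]]:
    """Return the (earliest, latest) time in ns at which the FPGA can latch
    Z80 write data, or None if the bus delays leave no safe window.

    d0_start_ns / d0_end_ns: the valid 'D0' slot at the Z80 pins.
    bus_delay_min_ns / bus_delay_max_ns: ASSUMED spread of delay from the
    Z80 pins to the FPGA, caused by the load on the bus.
    """
    # Data is only guaranteed to have arrived after the slowest path settles.
    earliest = d0_start_ns + bus_delay_max_ns
    # Data may start to disappear as soon as the fastest path sees D0 end.
    latest = d0_end_ns + bus_delay_min_ns
    if earliest >= latest:
        return None
    return earliest, latest


if __name__ == "__main__":
    # Example numbers only, not measured values.
    print(safe_capture_window(100, 200, bus_delay_min_ns=5, bus_delay_max_ns=10))
```

Pick a latch point comfortably inside the returned window rather than right at either edge.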