Someone told me to use either derive_pll_clocks, set_max_delay of 1.25 clock periods, or set_multicycle_path instead of a brute-force set_false_path command, which effectively says the signal could take 1 ns or 1 ms and the user doesn't care.
Never use a false path unless you do not care about random errors from build to build: the path delay may fall inside one clock period, or be captured on the next clock, or even two clocks later, since the compiler will not care how slow the path from A to B is and may route it any which way across the fabric. Use a false path only when there is no wiring between the two clock domains, or when you truly do not care about that path at all.
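For reference, a hedged sketch of what that constraint looks like in Quartus SDC syntax (the clock names here are placeholders, not from any actual design):

```tcl
# Illustrative only: clk_a / clk_b are hypothetical clock names.
# set_false_path removes ALL timing analysis between the two domains,
# so the fitter is free to route this path arbitrarily slowly -
# only safe when the domains truly exchange no timed signals.
set_false_path -from [get_clocks clk_a] -to [get_clocks clk_b]
```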
Only use a multicycle path (for example, a setting of 2) if your code has been written to tolerate a 1-clock or 2-clock data delay. That delay will be random from build to build and, worse, can differ between individual control signals and data bits if you apply it across multiple clock domains. So use it carefully, and code with that uncertainty in mind if you do.
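A sketch of the multicycle form, again with placeholder clock names; note that relaxing setup by N conventionally pairs with relaxing hold by N-1:

```tcl
# Illustrative only: allow 2 destination-clock cycles for setup between
# hypothetical clocks clk_src and clk_dst. Your logic must tolerate the
# data arriving either 1 or 2 clocks late, and that can vary per bit.
set_multicycle_path -setup 2 -from [get_clocks clk_src] -to [get_clocks clk_dst]
# Matching hold relaxation (setup N, hold N-1):
set_multicycle_path -hold  1 -from [get_clocks clk_src] -to [get_clocks clk_dst]
```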
Max delay should only be used for the IO pins, not internal paths: once again, each piece of silicon fabric can perform better than the compiler's worst-case target, while the IOs have specific, rigid timing wiring and transistors to achieve stable external interfacing. If you are the master sending the clock and all controls to the DDR3, you may actually tolerate a fairly large delay from the internal clock to the IO pins, so long as that delay is globally flat across all transmitting IOs. It will, however, generate havoc with regard to reading data back into the FPGA.
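A hedged sketch of bounding IO-pin delay in SDC (the port names are hypothetical, not from the actual controller); the point is less the exact numbers than keeping the bound identical across every transmitting pin:

```tcl
# Illustrative only: ddr3_addr/ddr3_cmd are placeholder port names.
# Bound the clock-to-output delay on the address/command pins so that
# all transmitting IOs land in the same, flat delay window.
set_max_delay -to [get_ports {ddr3_addr[*] ddr3_cmd[*]}] 4.0
set_min_delay -to [get_ports {ddr3_addr[*] ddr3_cmd[*]}] 1.0
```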
To solve your internal-core-to-IOBUF timing issues, have you tried adding a series of latches (a DFF chain) to give the compiler the ability to cross the clock domain gracefully on its own terms, without any such tricks? In my controller, the write data path going from CK_0 internally to the IO buffer's CK_90 first passes the entire DQ, DM, and OE busses through a 3-4 stage DFF chain before reaching my DDR_IOBUFF (which means all that data must also be valid and ready 3-4 clocks early). Without this chain, Quartus would cripple my write data path (it actually cripples my core's CK_0 FMAX, not the actual CK_90 clock) down to ~75 MHz, or generate huge negative slack in the nanosecond range, instead of my current >400 MHz clearance.
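A minimal sketch of such a pipeline chain, assuming a generic parameterized module (the name, widths, and stage count here are illustrative, not the actual BrianHG_DDR3_IO_PORT_ALTERA.sv code):

```systemverilog
// Illustrative only: a parameterized DFF pipeline clocked on the
// destination domain, giving the fitter several destination-clock
// cycles of routing slack before the data reaches the IO buffer.
module pipe_chain #(
    parameter int WIDTH  = 32,  // combined DQ+DM+OE bus width (placeholder)
    parameter int STAGES = 4    // number of pipeline registers
)(
    input  logic             clk_dst,  // e.g. CK_90
    input  logic [WIDTH-1:0] din,      // from the CK_0-domain serializer
    output logic [WIDTH-1:0] dout      // feeds the DDR IO buffer
);
    logic [WIDTH-1:0] pipe [STAGES];

    always_ff @(posedge clk_dst) begin
        pipe[0] <= din;
        for (int i = 1; i < STAGES; i++)
            pipe[i] <= pipe[i-1];
    end

    assign dout = pipe[STAGES-1];
endmodule
```

As noted above, the trade-off is latency: whatever drives `din` must present valid data STAGES clocks before it is needed at the pins.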
(In my code, 'BrianHG_DDR3_IO_PORT_ALTERA.sv (v0.95 build)', lines 595 to the end have this pipe running at CK_90, while lines 550-595 have the write BL8 data serializer running on my core system clock CK_0.)