If you're intending to pipeline your logic, then that's a different issue, because you then need to determine what the intermediate steps should be. Even then, the logic to actually describe those steps can be as verbose as you like - it doesn't cost anything.
I don't understand this "pipelining". What exactly does it mean?
Suppose you have a complex logical function that you want to execute; for example, say you have 100 bits you want to XOR together.
One way to do it is to build logic that can do it all in one go. You might quite reasonably build ten 10-input XOR gates, then join all their outputs to the inputs of an 11th gate. The output of that 11th gate is the final output you wanted, but it's taken two gates' worth of propagation delay to get there. Since the maximum clock speed of any domain is limited by the slowest device connected to it, your system clock can't have a period any shorter than 2x your gate delay.
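To make that structure concrete, here's a small Python sketch of the combinational version (the function names are just for illustration): ten 10-input XORs whose outputs feed one final gate, all settling within a single long clock period.

```python
from functools import reduce

def xor_reduce(bits):
    """XOR a list of bits together (models one wide XOR gate)."""
    return reduce(lambda a, b: a ^ b, bits)

def xor100_combinational(bits):
    """100-input XOR built as ten 10-input gates feeding an 11th gate.
    Both gate levels must settle in one clock period, so the clock
    can be no faster than two gate delays."""
    assert len(bits) == 100
    first_stage = [xor_reduce(bits[i:i + 10]) for i in range(0, 100, 10)]
    return xor_reduce(first_stage)
```

The result is just the parity of the 100 input bits, but the two-level structure is the point: both levels lie on the same combinational path.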
But: suppose you latch the outputs of the first ten gates into a set of D-types, then feed the outputs of those D-types into the inputs of the 11th gate. The final output comes from that 11th gate, and is valid after two clocks.
It still takes just as long (probably longer, in fact) to get the final output, but now your master clock can run twice as fast. The 100-input (first) stage can process a new block of data at the same time as the 10-input (second) stage is processing the intermediate results from the previous clock. This splitting up of logic into blocks, with memory bits used to store intermediate results, is pipelining.
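A sketch of the pipelined version, again in Python (class and attribute names are invented for illustration). The D-types between the stages are modelled as registers that update on each call to `clock()`; a new 100-bit word can be presented every clock, and each word's result emerges one clock after it was latched into the first stage.

```python
from functools import reduce

def xor_reduce(bits):
    """XOR a list of bits together (models one wide XOR gate)."""
    return reduce(lambda a, b: a ^ b, bits)

class PipelinedXor100:
    """Two-stage pipelined 100-input XOR.

    Stage 1's ten partial XORs are latched into D-type registers on the
    clock edge; stage 2 XORs those latched values on the next edge.
    A new word enters every clock, so throughput doubles even though
    each individual result takes two clocks to appear.
    """

    def __init__(self):
        self.stage1_regs = None  # D-types between the two gate levels

    def clock(self, bits):
        """Present a new 100-bit word; returns the result for the word
        presented on the *previous* clock (None until the pipe fills)."""
        assert len(bits) == 100
        # Stage 2 consumes what the registers held before this edge...
        result = xor_reduce(self.stage1_regs) if self.stage1_regs else None
        # ...while stage 1 latches partial results for the new word.
        self.stage1_regs = [xor_reduce(bits[i:i + 10]) for i in range(0, 100, 10)]
        return result
```

Note how the first call returns `None`: the pipeline has to fill before the first result appears, which is the latency cost the throughput gain is paid for with.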
With pipelining, although the latency through the system may be longer than without it, the throughput can be greatly increased.
In a simple CPU, the pipeline may consist of steps like: fetch instruction from SRAM, decode instruction into flags which control things like the ALU or stack, execute the instruction, and store results back to RAM. While each instruction is being executed, the next one is being fetched, and the results of the previous one are being written.
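The overlap is easy to see in a trace. Here's a hypothetical four-stage pipeline (stage names taken from the description above; the helper is purely illustrative) showing which instruction occupies each stage on each cycle:

```python
# Stage names from the description above; depth = position in the pipe.
STAGES = ["fetch", "decode", "execute", "store"]

def pipeline_trace(instructions, cycles):
    """Return, for each cycle, which instruction occupies each stage.
    Instruction k enters fetch on cycle k and moves one stage per cycle."""
    rows = []
    for cycle in range(cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            idx = cycle - depth
            row[stage] = instructions[idx] if 0 <= idx < len(instructions) else "-"
        rows.append(row)
    return rows

for cycle, row in enumerate(pipeline_trace(["i0", "i1", "i2"], 6)):
    print(cycle, row)
```

By cycle 3, `i0` is storing its result, `i1` is executing, and `i2` is being decoded: three instructions in flight at once, one completing per clock once the pipe is full.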
The trick in this case is to work out what useful intermediate results to produce from the original instruction word. That's something the synthesizer can't do for you, because it's a behavioural change (i.e. making the logic do thing A vs thing B), rather than just finding an efficient way to implement a defined mapping between inputs and outputs.