An AlDev measurement is actually a three-way measurement. At any time value, the number on the graph is the combination of the counter, the reference, and the device under test. Unless they're all similar, the result is mainly due to the worst of the three.
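A rough way to see why the worst of the three dominates: if the three noise contributions are independent (an assumption, though the usual one), they combine as a root-sum-of-squares. The sketch below uses made-up example values, not measurements from any particular instrument.

```python
import math

def combined_adev(counter, reference, dut):
    """Root-sum-of-squares of three independent AlDev contributions
    at a single tau. Independence is an assumption."""
    return math.sqrt(counter**2 + reference**2 + dut**2)

# Hypothetical values at tau = 1 s: counter floor 1e-9,
# reference 1e-11, device under test 1e-12.
total = combined_adev(1e-9, 1e-11, 1e-12)
print(f"combined: {total:.6e}")
```

With these numbers the combined value is within a small fraction of a percent of the counter's own 1e-9, so the plot tells you almost nothing about the reference or the device under test.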
As awallin and VK5RC mentioned, measuring the counter's internal oscillator is a good way to check the counter's noise floor. The 5335A has a minimum resolution of 1 ns. Since counters typically do better than their spec, you should expect to see a 1 sec. AlDev value that's somewhat less than 1e-9. For larger time values, the plot will slope down and to the right, i.e. 1e-10 @ 10 sec, 1e-11 @ 100 sec, etc. Eventually it will become a horizontal line, but you might have to wait a crazy amount of time for that. This test gives the best numbers you'll ever see with that counter, and it could be argued that it isn't a 'real world' number since you're measuring a source that's synchronized with the counter's timebase.
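The downward slope described above is just the resolution divided by the averaging time. As a rule-of-thumb sketch (the function name is made up for illustration, and the exact constant factor depends on the noise type and the counter), the expected floor can be tabulated like this:

```python
def expected_floor(tau, resolution=1e-9):
    """Rule-of-thumb noise floor: resolution / tau.
    resolution is the counter's single-shot resolution in seconds
    (1 ns assumed here); tau is the averaging time in seconds."""
    return resolution / tau

for tau in (1, 10, 100, 1000):
    print(f"tau = {tau:>4} s  ->  AlDev floor ~ {expected_floor(tau):.0e}")
```

If the measured plot follows this line for a while and then flattens out, the flat part is where something other than the counter's resolution has taken over.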
A more 'real world' noise floor measurement is to take an external oscillator (e.g. your 10811) and connect it to Channel A via a T connector. Then go through a cable to Channel B. This guarantees that there will be a delay between Channel A and B. Now measure the delay and calculate the AlDev. Make sure that the cable doesn't move during the measurement and keep the temperature stable. The results might be similar to or maybe somewhat worse than the above measurements.
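For reference, here is a minimal sketch of how an AlDev value can be calculated from a list of time interval readings, using the standard overlapping Allan deviation formula. It assumes the readings are in seconds and evenly spaced tau0 seconds apart. This is not the code any particular tool uses.

```python
import math

def overlapping_adev(x, tau0, m):
    """Overlapping Allan deviation from time interval (phase) readings.

    x    : list of time interval readings in seconds
    tau0 : spacing between readings in seconds (assumed constant)
    m    : averaging factor, so that tau = m * tau0
    """
    n = len(x)
    if n < 2 * m + 1:
        raise ValueError("need at least 2*m + 1 readings")
    tau = m * tau0
    total = 0.0
    for i in range(n - 2 * m):
        # Second difference of the phase data at spacing m
        d = x[i + 2 * m] - 2.0 * x[i + m] + x[i]
        total += d * d
    return math.sqrt(total / (2.0 * (n - 2 * m) * tau * tau))

# A perfectly constant cable delay contributes nothing:
print(overlapping_adev([25e-9] * 100, tau0=1.0, m=1))
```

The constant delay through the cable drops out of the second difference, so whatever is left on the plot is the counter's own noise plus any drift in the cable.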
If the results of either of the above tests are significantly better than expected, you've got some setting wrong and you need to sort that out before you start making real measurements.
Wherever possible, you should use time interval measurements rather than frequency. When you use frequency, you're averaging the measurement over whatever gate interval you select. This can obscure the information you're trying to measure. Time interval measurements measure a single event. Some counters can average time interval measurements - make sure you don't do that. Also, some counters measure time interval better than they measure frequency due to their internal architecture. My HP 5370B and Wavecrest DTS-2077 are like that.
It's usually a good idea to divide the signals down to a lower frequency before making the measurements. A lower frequency has a longer period, which keeps the phase offset between the signals from wrapping around during the run. Any wraps must be removed before analysis. TimeLab does it automatically, but it's still better to start with clean data. If it's not convenient to divide the signals down, just keep it in mind and try to adjust the frequencies to minimize phase wraps.
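To put numbers on it: at 10 MHz the period is 100 ns, so two sources that differ by 1e-9 in fractional frequency drift 1 ns per second and wrap every 100 seconds. Divided down to 1 pulse per second, the same drift would take about 1e9 seconds to wrap. If you do end up with wraps, the sketch below shows one simple way to remove them. It is only an illustration, not how TimeLab does it, and it assumes the true phase moves by less than half a period between consecutive readings.

```python
def unwrap_phase(x, period):
    """Remove phase wraps from time interval readings.

    x      : list of readings in seconds
    period : period of the measured signal in seconds
    Assumes the real change between consecutive readings is
    less than period / 2.
    """
    if not x:
        return []
    out = [x[0]]
    offset = 0.0
    for prev, cur in zip(x, x[1:]):
        step = cur - prev
        if step > period / 2:
            offset -= period   # reading jumped up: wrapped backward
        elif step < -period / 2:
            offset += period   # reading jumped down: wrapped forward
        out.append(cur + offset)
    return out

# 10 MHz signals (100 ns period), phase drifting 10 ns per reading
readings = [80e-9, 90e-9, 0e-9, 10e-9]
print(unwrap_phase(readings, period=100e-9))
```

The half-period assumption is why noisy data near a wrap point can fool this kind of routine, which is one more reason to start with data that doesn't wrap in the first place.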
Once you've figured out how to make reasonable measurements you can start to understand what they mean.
Ed