...
Update 26/07
Changing CELLVOLTAGE_AVERAGE_WINDOW_SIZE from 4 to 8 allows me to run high suction mode. Max suction mode is not working: the measured voltage at peak is 19.76V before it cuts the power off. Increasing the averaging window size further does not help, and high suction stops working when a size of 20 is set. I am happy with the result as I only use high suction.
In terms of temperature, I changed MAX_CHARGE_TEMP_C from 50 to 59. I tried 55 but it still caused trouble. Part of the issue could be the silicone glue over the IC, as this is another thermal barrier.
Thanks for reporting your results! I'm really glad to hear you got it working for you.
I'm somewhat surprised that increasing the averaging window made a difference. The way the code works is that it keeps a log of the last X cell voltage measurements, where X = CELLVOLTAGE_AVERAGE_WINDOW_SIZE. When the code asks for the voltage of a specific cell, it sums all of the past voltages for that cell and divides by CELLVOLTAGE_AVERAGE_WINDOW_SIZE to get an average. The trick is that during initial startup, many of the past measurement data points will be zero; for example, the average of [4.1, 0, 0, 0] is 1.025V, which would instantly cause an under-voltage error. To avoid this, the code only divides by the number of data points collected so far (variable num_iterations), up to the window size. So the first time, the data for a cell might be [4.1, 0, 0, 0], divided by 1, giving an average of 4.1V. The second time might have data [4.1, 4.08, 0, 0], divided by 2, giving an average of 4.09V, etc.
As you can see, increasing the averaging window size will smooth out voltage fluctuations, but only after all historical data points are populated. It shouldn't make any difference during startup (where startup = you just pulled the trigger, the BMS woke up, and needs to immediately enable the output if all safety checks pass).
I'd have to check some other time with the debugger, but the cell voltage history array probably has 2-3 data points collected before the output is enabled, just due to the number of times the main loop executes as it goes from init > idle > output_en states. There might also be a small delay since the code waits for the ISL94208 to report the trigger is pressed, in addition to the PIC checking that input itself. Perhaps increasing CELLVOLTAGE_AVERAGE_WINDOW_SIZE to 8 is just enough to keep a few more good data points in the average calculation, so the under-voltage cutout doesn't trip. I have no idea why increasing the window size to 20 would make things worse than setting it to 8, though. I don't think it would be running out of RAM, and I also don't think the calculation would take so long that the watchdog kicks in.
I've briefly considered possible solutions to the averaging-not-helping-at-startup issue, but I haven't come up with anything great.
Possible non-ideal solutions:
- Wait until the cell voltage history array is full before enabling the output, so any inrush-current-induced voltage drop can be averaged out by the other n-1 good data points.
- Disable under-voltage protection for a short period after the output is enabled
- Maybe just lower the under-voltage cutout voltage for a short time after output is first enabled?
- Populate the cell voltage history with "normal" values like 3.6V for all array slots that haven't been collected yet, so the data might be [4.1, 3.6, 3.6, 3.6] instead of [4.1, 0, 0, 0].
- Add some delay to the under voltage cut out or require multiple consecutive under-voltage readings before kicking in.
Number 3 is probably the best option and might not be too invasive to implement.
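A minimal sketch of what option 3 could look like. To be clear, the function names, the 2500mV relaxed cutoff, and the 8-iteration grace period here are illustrative guesses, not the real firmware; only MIN_DISCHARGE_CELL_VOLTAGE_mV is an actual name from the project, and its value here is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_DISCHARGE_CELL_VOLTAGE_mV 3000  /* normal cutoff; value assumed    */
#define STARTUP_UV_CUTOFF_mV          2500  /* relaxed cutoff, just a guess    */
#define STARTUP_GRACE_ITERATIONS      8     /* main-loop passes to stay relaxed */

static uint8_t iterations_since_enable;

/* Hypothetical hook: called once when the output FET is switched on. */
void on_output_enabled(void) {
    iterations_since_enable = 0;
}

/* Hypothetical check, called every main-loop pass: returns true if an
 * under-voltage fault should trip for the given averaged cell voltage.
 * For the first few passes after enable, use the relaxed limit so the
 * inrush-current droop doesn't instantly cut the output. */
bool undervoltage_tripped(uint16_t cell_mV) {
    uint16_t limit = MIN_DISCHARGE_CELL_VOLTAGE_mV;
    if (iterations_since_enable < STARTUP_GRACE_ITERATIONS) {
        iterations_since_enable++;
        limit = STARTUP_UV_CUTOFF_mV;
    }
    return cell_mV < limit;
}
```

With this, a 2.8V reading during the first few loop passes after enable would be tolerated, while the same reading later on would still trip the normal 3.0V cutoff.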
Now that you have the vacuum working, if you feel like tinkering with it some more, you could measure your cell ESR by measuring the cell voltage drop and current draw of the vacuum simultaneously. For example, to measure cell 1: connect a multimeter to cell 1 and measure the no-load voltage. Let's say you measure 4.12V. Now you start the vacuum and measure that it is drawing 3.27A, and while it is drawing that current, cell 1 measures 3.74V. You can approximate the ESR of cell 1 as the voltage drop divided by the current (V/I = R). So 4.12V - 3.74V = 380mV, and 380mV / 3.27A = 116mOhm ESR for cell 1. I think I read this isn't the ideal way to measure ESR, but it's probably a good approximation. You could then calculate that if cell 1 has an ESR of 116mOhm and you put the vacuum in max suction mode drawing 17A, your voltage drop due to ESR would be 17A * 116mOhm = 1.972V! Meaning, your no-load cell voltage of 4.12V would sag to 2.148V! Those are rough calculations, but that ESR figure would tell us that no amount of averaging will fix the fact that you can't draw 17A from this hypothetical cell and still keep the cell voltage above 3V. You would want to repeat the measurements for each cell, but you can apply the test load across the entire pack.
My long rant aside, another option you could tweak is to reduce MIN_DISCHARGE_CELL_VOLTAGE_mV to something like 2700 or maybe even 2500. This might help with both the inrush current voltage droops, and any cells with poor ESR in general.
Also, thanks for the information on the charging over-temp. I'm curious, what was/is your ambient temperature when charging? All my testing was done at 21C ambient and with the plastic case removed, so I could totally see issues like yours popping up with the case installed and a higher ambient temperature. Also, just to double-check, did it take a few minutes, or at least 30 seconds or so, before you encountered the charging over-temp error? That would indicate it's just heating up and I have the limit set too low. If it errors out instantly, that might be something totally different.
One last thing:
An interesting thing happened by accident when I put the battery in without the trigger mechanism (silly me). The motor started and ran in high suction mode. This is only achievable when the battery pack goes into sleep mode. I tried to repeat the same using the trigger button, but with no success. I ran the motor for 39 min straight, with no filter and no attachment fitted.
Does this mean that somehow the BMS was asleep (LED turned off) but the output was still on? If so, that's troubling to me since it means somehow the output was enabled but all protection features would be off. That would mean there might be a serious bug in my code, since that shouldn't be possible.
Could you provide any more details on what exactly happened?