Topic: How to receive a file from a web server into a low-RAM embedded target with LWIP

peter-h (topic starter)

This refers to:
https://www.eevblog.com/forum/programming/do-httphttps-servers-ever-enforce-multipart-for-file-downloads/

I have done this before between a simple HTTP server (which I wrote myself, using the netconn API of LWIP) and a client browser, for file transfer. In that case the client end was running JavaScript, to enable things like a progress bar (which is not possible in plain HTTP for browser -> server, and only primitively for server -> browser) and to avoid the multipart encoding which browsers (apparently) use when uploading files.

But this is different, because of the way the LWIP API works when receiving data. You send the file request and then you wait for the header:

> $ telnet xxxx.com 80
> Trying x.x.x.x...
> Connected to xxxx.com.
> Escape character is '^]'.
> GET /how/yyyy/glide.png HTTP/1.1
> Host: xxxx.com
>
> HTTP/1.1 200 OK
> Date: Sat, 26 Aug 2023 14:10:40 GMT
> Server: Apache/2.4.41 (Ubuntu)
> Last-Modified: Sun, 06 Apr 2014 08:20:34 GMT
> ETag: "yyyyyyyyyyyyyy"
> Accept-Ranges: bytes
> Content-Length: 7901
> Content-Type: image/png
>
> <89>PNG^Z (...)

The received data starts with "HTTP/1.1 200 OK".

This is easy on a PC client. You malloc a few MB of RAM and use the socket API to read stuff into that.

But how would you do this on an embedded target which is the client? Say you have only a few KB of RAM to play with, and you want to download a 1MB file. The LWIP API (whether netconn or socket) will return a buffer with some data. You get a pointer to it. The simple way is to assume the whole HTTP header is in there, possibly followed by 1 or more bytes of file data. Normally this is a fair assumption; the first packet will be up to one MTU. But of course this assumption could be wrong, for example if the link is very slow and the header arrives in pieces.

So one can improve it by parsing bytes, one at a time, until you see "HTTP" and then you wait say 1 second, and then it is even safer to assume the whole file header is in the buffer. You need timeouts anyway. Once you have the header you have the byte count and then you can take it from there (with timeouts etc).

This bit of my code shows what I am doing. It is actually from a different context (receiving data from an edit box, implemented with JS) but it is just the same idea:

Code:

/*
 *
 * This function receives the PUT data from a JS-submitted textarea box.
 * The PUT etc is generated by a JS script.
 * We get here only if data has arrived from client browser i.e. from netconn_recv
 * Buffer buf already contains first packet and starts with:
 *
 * PUT /ufile=BOOT.TXT HTTP/1.1..Host: 192.168.3.73:81..User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64;
 * x64; rv:103.0) Gecko/20100101 Firefox/103.0..Accept: .....Accept-Language: en-US,en;q=0.5..Accept-En
 * coding: gzip, deflate..Content-Type: text/plain;charset=UTF-8..Content-Length: 198..Origin: http
 * ://192.168.3.73:81..Connection: keep-alive..Referer: http://192.168.3.73:81/efile?BOOT.TXT....boot
 * time: 2022-08-02 15:39:28.app name: appname_1.1
 *
 * The likelihood of the 1st packet containing the entire header including the CRLFCRLF
 * depends on the value of PBUF_POOL_BUFSIZE in lwipopts.h, and on the program sending
 * the data (the browser).
 *
 * We get here with the filename at buf[11].
 * To get the data size, search for Content-Length: and parse the number after it.
 * To find the start of the data, search for CRLFCRLF; the data starts right after it.
 *
 * The actual file data is after a CRLFCRLF and is quite likely in this buffer, to some extent.
 *
 * This function writes data directly from LWIP's buffers to the filesystem, so there
 * is no 512 byte etc limitation there.
 *
 */

static void EditGetData(struct netconn *conn, char *buf, uint16_t buflen)
{

    struct netbuf *nbuf = NULL;     // address of a netbuf
    err_t err = ERR_RST;            // some initial value ("connection closed")
    char filename[20];
    char filesizebuf[20];
    uint32_t filesize=0;            // size extracted from client header
    bool found3=true;               // false if any of the 3 items not found in header
    char * ptr = NULL;
    FIL fp;
    UINT actual_length=0;           // #bytes actually written to file
    uint32_t total_written=0;       // accumulation of above
    bool wr_fail = false;

    // Buf already holds the first load of data. Length is buflen. Parsing this assumes that
    // - the relevant part of the client's header is in buf
    // - enough data has actually arrived for the above to be true (might need a short delay)
    // The lwipopts.h PBUF_POOL_BUFSIZE parameter has a direct bearing on this and needs to be >500

    // Never search past the data actually received; cap at a plausible value (MTU)
    size_t srchlen = (buflen < 1500) ? buflen : 1500;

    // Extract filename
    strncpy(filename,&buf[11],sizeof(filename));    // copy over filename, ' ' terminated
    ptr = strnstr(filename," ",sizeof(filename));
    if (ptr != NULL)
    {
        *ptr = 0;   // null-terminate filename (replace ' ' with 0); one byte only
    }
    else
    {
        found3=false;
    }

    // Extract file size; limit search for "Content-Length:" to srchlen
    ptr = strnstr(buf,"Content-Length:",srchlen);
    if ( (ptr != NULL) && found3 )
    {
        strncpy(filesizebuf, 15+ptr, sizeof(filesizebuf)-1);
        filesizebuf[sizeof(filesizebuf)-1] = 0;     // strncpy does not guarantee termination
        filesize=strtoul(filesizebuf, NULL, 10);    // stops at the first non-digit (the CR)
    }
    else
    {
        found3=false;
    }

    // Extract the portion of the file in buf. Typically this is at buf+400 or so
    // It could be zero (if the CRLFCRLF is found right at the end) and this actually happens with Edge!
    uint32_t foffset=0;     // offset of where file data starts in buf
    int32_t flen=0;         // size of file data
    ptr = strnstr(buf,CRLFCRLF,srchlen);
    if ( (ptr != NULL) && found3 )
    {
        foffset = 4+(uint32_t)(ptr-buf);            // 4 to skip CRLFCRLF
        flen = (int32_t)buflen-(int32_t)foffset;    // signed arithmetic so <0 is detectable
        if (flen<0) found3=false;
    }
    else
    {
        found3=false;
    }

Obviously this is a hack, but how would it be done "properly"? The only way I can think of is a "one byte at a time" state machine, with a timeout at each byte.
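For what it is worth, the "one byte at a time" state machine can be quite small. Below is a minimal sketch in plain C, not tied to LWIP: hdr_end_t, hdr_end_feed and HDR_END_NOT_FOUND are hypothetical names made up for illustration. It only finds the CRLFCRLF that ends the header, keeping one byte of state so the match survives the header being split across received buffers. The per-byte timeout is assumed to be handled outside it, by whatever receive timeout the network API provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an lwIP type): remembers how many bytes of
 * CR LF CR LF have been matched so far, across calls. */
typedef struct {
    uint8_t matched;    /* 0..3 */
} hdr_end_t;

#define HDR_END_NOT_FOUND ((size_t)-1)

/* Feed one received chunk of any size (even 1 byte).
 * Returns the offset in data of the first body byte (this can equal len,
 * meaning the body starts in the next chunk), or HDR_END_NOT_FOUND if the
 * end of the header has not been seen yet. */
static size_t hdr_end_feed(hdr_end_t *st, const uint8_t *data, size_t len)
{
    static const uint8_t pat[4] = { '\r', '\n', '\r', '\n' };

    assert(st != NULL);
    for (size_t i = 0; i < len; i++) {
        if (data[i] == pat[st->matched]) {
            if (++st->matched == 4) {
                st->matched = 0;
                return i + 1;
            }
        } else {
            /* Mismatch: start over, but a CR may begin a new match. */
            st->matched = (data[i] == '\r') ? 1 : 0;
        }
    }
    return HDR_END_NOT_FOUND;
}
```

The caller would zero an hdr_end_t once per request and call hdr_end_feed on each buffer as it arrives; everything from the returned offset onwards is file data and can go straight to the filesystem.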

The netconn buffer size can perhaps be extracted from this (recv_avail?):

Code:
/** A netconn descriptor */
struct netconn {
  /** type of the netconn (TCP, UDP or RAW) */
  enum netconn_type type;
  /** current state of the netconn */
  enum netconn_state state;
  /** the lwIP internal protocol control block */
  union {
    struct ip_pcb  *ip;
    struct tcp_pcb *tcp;
    struct udp_pcb *udp;
    struct raw_pcb *raw;
  } pcb;
  /** the last error this netconn had */
  err_t last_err;
#if !LWIP_NETCONN_SEM_PER_THREAD
  /** sem that is used to synchronously execute functions in the core context */
  sys_sem_t op_completed;
#endif
  /** mbox where received packets are stored until they are fetched
      by the netconn application thread (can grow quite big) */
  sys_mbox_t recvmbox;
#if LWIP_TCP
  /** mbox where new connections are stored until processed
      by the application thread */
  sys_mbox_t acceptmbox;
#endif /* LWIP_TCP */
  /** only used for socket layer */
#if LWIP_SOCKET
  int socket;
#endif /* LWIP_SOCKET */
#if LWIP_SO_SNDTIMEO
  /** timeout to wait for sending data (which means enqueueing data for sending
      in internal buffers) in milliseconds */
  s32_t send_timeout;
#endif /* LWIP_SO_RCVTIMEO */
#if LWIP_SO_RCVTIMEO
  /** timeout in milliseconds to wait for new data to be received
      (or connections to arrive for listening netconns) */
  int recv_timeout;
#endif /* LWIP_SO_RCVTIMEO */
#if LWIP_SO_RCVBUF
  /** maximum amount of bytes queued in recvmbox
      not used for TCP: adjust TCP_WND instead! */
  int recv_bufsize;
  /** number of bytes currently in recvmbox to be received,
      tested against recv_bufsize to limit bytes on recvmbox
      for UDP and RAW, used for FIONREAD */
  int recv_avail;
#endif /* LWIP_SO_RCVBUF */
#if LWIP_SO_LINGER
   /** values <0 mean linger is disabled, values > 0 are seconds to linger */
  s16_t linger;
#endif /* LWIP_SO_LINGER */
  /** flags holding more netconn-internal state, see NETCONN_FLAG_* defines */
  u8_t flags;
#if LWIP_TCP
  /** TCP: when data passed to netconn_write doesn't fit into the send buffer,
      this temporarily stores how much is already sent. */
  size_t write_offset;
  /** TCP: when data passed to netconn_write doesn't fit into the send buffer,
      this temporarily stores the message.
      Also used during connect and close. */
  struct api_msg *current_msg;
#endif /* LWIP_TCP */
  /** A callback function that is informed about events for this netconn */
  netconn_callback callback;
};

but that doesn't really help much. What if recv_avail is less than the number of bytes required as a minimum to extract the file header from? One then needs a local buffer of at least that size and build up the data in that. But the buffer needs to be bigger; it needs to be at least the size "recv_avail" because you cannot "un-read" the data, so you have to extract everything in the LWIP buffer before doing any more reading. But AFAICT you cannot determine the upper bound on recv_avail, except indirectly, either via the MTU value or via the LWIP buffer config (PBUF_POOL_BUFSIZE).

With the socket API you get the same issue.

All code samples I have seen just chuck a load of RAM at this.

ejeffrey

An easy and mostly acceptable way to do it is to define a maximum line size for the header, and create a global buffer of that size.  When you get a new buffer from the network, you copy into the line buffer until you get to EOL.  If you get to EOL, you process the line to look for any headers you care about (e.g., content length).  If there is any data left in the buffer, you start over.  Whatever left-over data you have that isn't a full line you store in the buffer and wait for the next message from the network API.
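A rough sketch of that line-buffer approach in plain C follows. All names (hdr_lines_t, hdr_lines_feed, LINE_MAX_LEN) and the 128-byte line limit are assumptions for illustration, not anything from LWIP. Lines longer than the buffer are simply ignored, which is assumed to be acceptable because the headers of interest are short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LINE_MAX_LEN 128    /* assumed maximum header line size */

typedef struct {
    char   line[LINE_MAX_LEN];
    size_t used;            /* bytes of the current, still partial, line */
    int    overflow;        /* current line did not fit; it will be ignored */
    long   content_length;  /* -1 until a Content-Length line is seen */
    int    done;            /* blank line seen: header is complete */
} hdr_lines_t;

static void hdr_lines_init(hdr_lines_t *h)
{
    memset(h, 0, sizeof *h);
    h->content_length = -1;
}

/* Called with one complete line in h->line (NUL-terminated, CRLF removed). */
static void hdr_lines_process(hdr_lines_t *h)
{
    static const char key[] = "content-length:";
    size_t i;

    if (h->used == 0) {     /* empty line marks the end of the header */
        h->done = 1;
        return;
    }
    for (i = 0; key[i] != '\0'; i++) {
        if (tolower((unsigned char)h->line[i]) != key[i])
            return;         /* not a header we care about */
    }
    h->content_length = strtol(h->line + i, NULL, 10);
}

/* Feed one network buffer. Returns the number of bytes consumed; this is
 * less than len only when the header ended inside this buffer, in which
 * case data + return value is the start of the body. */
static size_t hdr_lines_feed(hdr_lines_t *h, const char *data, size_t len)
{
    size_t i = 0;

    assert(h != NULL);
    while (i < len && !h->done) {
        char c = data[i++];
        if (c == '\n') {
            if (h->used > 0 && h->line[h->used - 1] == '\r')
                h->used--;
            h->line[h->used] = '\0';
            if (!h->overflow)
                hdr_lines_process(h);
            h->used = 0;
            h->overflow = 0;
        } else if (h->used < LINE_MAX_LEN - 1) {
            h->line[h->used++] = c;     /* partial line is kept across calls */
        } else {
            h->overflow = 1;
        }
    }
    return i;
}
```

The RAM cost is one line buffer regardless of how the header is split across network buffers, and regardless of how large those buffers are.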

Since you probably don't care about a lot of the request headers, as soon as you get to the ':' separator you can decide if you care about that header and if not just discard the rest of the line without even copying it to the buffer.
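Taken to its extreme, that idea needs no line buffer at all: match the header name byte by byte up to the ':', and either parse the value on the fly or skip to the end of the line. A minimal sketch in plain C, assuming Content-Length is the only header of interest; hdr_filter_t and the other names are hypothetical, and overflow of the 32-bit length is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

enum hf_state { HF_NAME, HF_VALUE, HF_SKIP, HF_DONE };

typedef struct {
    enum hf_state state;
    uint8_t  name_pos;          /* chars of the wanted name matched so far */
    uint8_t  line_empty;        /* nothing but CR seen on the current line */
    uint8_t  have_length;       /* at least one digit of the value parsed */
    uint32_t content_length;
} hdr_filter_t;

static void hdr_filter_init(hdr_filter_t *f)
{
    f->state = HF_NAME;
    f->name_pos = 0;
    f->line_empty = 1;
    f->have_length = 0;
    f->content_length = 0;
}

/* Feed one network buffer. Returns the number of bytes consumed; less than
 * len only when the header ended inside this buffer (f->state == HF_DONE),
 * in which case the remaining bytes are body data. */
static size_t hdr_filter_feed(hdr_filter_t *f, const char *data, size_t len)
{
    static const char wanted[] = "content-length";
    size_t i = 0;

    assert(f != NULL);
    while (i < len && f->state != HF_DONE) {
        unsigned char c = (unsigned char)data[i++];

        if (c == '\r')
            continue;
        if (c == '\n') {        /* end of line; an empty line ends the header */
            f->state = f->line_empty ? HF_DONE : HF_NAME;
            f->name_pos = 0;
            f->line_empty = 1;
            continue;
        }
        f->line_empty = 0;

        switch (f->state) {
        case HF_NAME:
            if (c == ':' && wanted[f->name_pos] == '\0') {
                f->state = HF_VALUE;        /* full name matched */
                f->have_length = 0;
                f->content_length = 0;
            } else if (wanted[f->name_pos] != '\0' &&
                       tolower(c) == wanted[f->name_pos]) {
                f->name_pos++;
            } else {
                f->state = HF_SKIP;         /* some other header: discard */
            }
            break;
        case HF_VALUE:
            if (isdigit(c)) {
                f->content_length = f->content_length * 10u + (uint32_t)(c - '0');
                f->have_length = 1;
            } else if (!((c == ' ' || c == '\t') && !f->have_length)) {
                f->state = HF_SKIP;         /* anything after the number */
            }
            break;
        default:                            /* HF_SKIP: ignore until LF */
            break;
        }
    }
    return i;
}
```

The whole parser state is a handful of bytes, so the only buffer involved is the one the network API hands over.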
 

