
Opened 5 years ago

Closed 5 years ago

#592 closed defect (fixed)

tcp keeps creating new memory areas

Reported by: Jiri Svoboda Owned by: Jakub Jermář
Priority: major Milestone: 0.6.0
Component: helenos/lib/c Version: mainline
Keywords: Cc:
Blocker for: Depends on:
See also:

Description (last modified by Jiri Svoboda)

tcp has quite a few memory areas and a few (1-10) are created every time websrv serves a file. The number of new memory areas does not appear to depend on the size of the file. It looks more like this happens per connection.

Some of the new areas are 1 MB (2^20 B) in size, others are 8 kB (this is on ia32).

As far as I know, TCP does not leak heap blocks anymore. Note that for each connection we create two fibril timers (and thus two fibrils) and destroy them when the connection is terminated. So this might be an issue with malloc() and/or with fibril stack allocation.

After I serve /index.html a few times, taskdump on tcp gives:

Task Dump Utility
Dumping task 'tcp' (task ID 32).
failed opening file
Loaded symbol table from /srv/tcp

Threads:
 [1] hash: 0x8757a000
Thread 0x8757a000: PC = 0x00022899 (ra+0). FP = 0x00132f0c
  0x00132f0c: 0x00022899 (ra+0)
  0x00132f58: 0x00740650 (loc_callback_created+7406484)
  0x00132fd8: 0x000179a1 (async_manager_fibril+161)
  0x00132ff8: 0x0000afd9 (fibril_main+25)

Address space areas:
 [1] flags: R-XC base: 0x00001000 size: 188416
 [2] flags: RW-C base: 0x0002f000 size: 8192
 [3] flags: RW-C base: 0x00031000 size: 4096
 [4] flags: RW-C base: 0x00033000 size: 1048576
 [5] flags: RW-C base: 0x00134000 size: 16384
 [6] flags: RW-C base: 0x00139000 size: 1048576
 [7] flags: RW-C base: 0x0023a000 size: 1048576
 [8] flags: RW-C base: 0x0033b000 size: 1048576
 [9] flags: RW-C base: 0x0043c000 size: 1048576
 [10] flags: RW-C base: 0x0053d000 size: 1048576
 [11] flags: RW-C base: 0x0063e000 size: 1048576
 [12] flags: RW-C base: 0x0073f000 size: 8192
 [13] flags: RW-C base: 0x00742000 size: 1048576
 [14] flags: RW-C base: 0x00843000 size: 1048576
 [15] flags: RW-C base: 0x00944000 size: 16384
 [16] flags: RW-C base: 0x00949000 size: 1048576
 [17] flags: RW-C base: 0x00a4a000 size: 1048576
 [18] flags: RW-C base: 0x00b4b000 size: 8192
 [19] flags: RW-C base: 0x00b4e000 size: 1048576
 [20] flags: RW-C base: 0x00c4f000 size: 1048576
 [21] flags: RW-C base: 0x00d50000 size: 16384
 [22] flags: RW-C base: 0x00d55000 size: 1048576
 [23] flags: RW-C base: 0x00e56000 size: 1048576
 [24] flags: RW-C base: 0x00f57000 size: 8192
 [25] flags: RW-C base: 0x00f5a000 size: 1048576
 [26] flags: RW-C base: 0x0105b000 size: 1048576
 [27] flags: RW-C base: 0x0115c000 size: 1048576
 [28] flags: RW-C base: 0x0125d000 size: 1048576
 [29] flags: RW-C base: 0x0135e000 size: 16384
 [30] flags: RW-C base: 0x01363000 size: 1048576
 [31] flags: RW-C base: 0x01464000 size: 1048576
 [32] flags: RW-C base: 0x01565000 size: 8192
 [33] flags: RW-C base: 0x01568000 size: 1048576
 [34] flags: RW-C base: 0x01669000 size: 1048576
 [35] flags: RW-C base: 0x0176a000 size: 16384
 [36] flags: RW-C base: 0x0176f000 size: 1048576
 [37] flags: R--C base: 0x01870000 size: 4096
 [38] flags: RW-C base: 0x01874000 size: 1048576
 [39] flags: RW-C base: 0x01975000 size: 8192
 [40] flags: RW-C base: 0x01978000 size: 1048576
 [41] flags: RW-C base: 0x01a79000 size: 1048576
 [42] flags: RW-C base: 0x01b7a000 size: 16384
 [43] flags: RW-C base: 0x01b7f000 size: 1048576
 [44] flags: RW-C base: 0x01c80000 size: 1048576
 [45] flags: RW-C base: 0x01d81000 size: 8192
 [46] flags: RW-C base: 0x01d84000 size: 1048576
 [47] flags: RW-C base: 0x01e85000 size: 8192
 [48] flags: RW-C base: 0x01e88000 size: 1048576
 [49] flags: RW-C base: 0x01f89000 size: 8192
 [50] flags: RW-C base: 0x01f8c000 size: 1048576
 [51] flags: RW-C base: 0x0208d000 size: 8192
 [52] flags: RW-C base: 0x02090000 size: 1048576
 [53] flags: RW-C base: 0x02191000 size: 8192
 [54] flags: RW-C base: 0x02194000 size: 1048576
 [55] flags: RW-C base: 0x02295000 size: 8192
 [56] flags: RW-C base: 0x02298000 size: 1048576
 [57] flags: RW-C base: 0x02399000 size: 8192
 [58] flags: RW-C base: 0x0239c000 size: 1048576
 [59] flags: R-XC base: 0x70001000 size: 135168
 [60] flags: RW-C base: 0x70022000 size: 4096
 [61] flags: RW-C base: 0x70023000 size: 4096
 [62] flags: RW-C base: 0x70025000 size: 1048576
 [63] flags: RW-C base: 0x70126000 size: 1048576
 [64] flags: RW-C base: 0x7ff00000 size: 1048576

Perhaps the most noticeable impact is that dumping core takes a long time because the core file becomes huge.

Attachments (1)

Screenshot from 2014-07-11 15:49:24.png (23.7 KB) - added by Jakub Jermář 5 years ago.
Screenshot illustrating how fibrils in TCP get created per one index.html request. 0xc0a0 is the timer func fibril. 0x4f20 is sock recv fibril.


Change History (7)

comment:1 Changed 5 years ago by Jiri Svoboda

Description: modified (diff)

comment:2 Changed 5 years ago by Jakub Jermář

I think there is a problem with fibril_timer_func fibrils.

When TCP starts, it creates a connection_fibril, a tcp_sock_recv_fibril and 6x fibril_timer_func.

On the initial HTTP request for index.html, I observed 9 fibrils created (6x fibril_timer_func, 3x tcp_sock_recv_fibril) but only 7 destroyed (4x fibril_timer_func, 3x tcp_sock_recv_fibril), so 2 fibril_timer_func fibrils did not get destroyed (or rather, their stacks were not released).

On subsequent requests, the number of created fibril_timer_func fibrils dropped to 4, but only 2 of these were destroyed. So again, 2 of them leaked.

In conclusion, it looks like two fibril_timer_func fibrils leak per one index.html request (which itself can be recursive?). See the attached screenshot, which illustrates this.

Changed 5 years ago by Jakub Jermář

Screenshot illustrating how fibrils in TCP get created per one index.html request. 0xc0a0 is the timer func fibril. 0x4f20 is sock recv fibril.

comment:3 Changed 5 years ago by Jiri Svoboda

Yes, they are definitely leaking; I verified that by adding a list of all fibrils and inspecting it in a core dump. The question is why. Maybe the fibril timers are not being destroyed properly. That explains the 8kB-sized areas. It does not, however, explain the 1MB-sized areas. Those seem to be used for allocating heap blocks (I am getting pointers with high addresses), so that would be a bug in malloc.
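The debugging aid mentioned above can be sketched like this (illustrative C, not the actual HelenOS change; all names are invented, and a real fibril list would need locking): every live fibril is kept on a global intrusive list, so a core dump or a debug printout immediately reveals the ones that were never torn down.

```c
#include <stdlib.h>
#include <stddef.h>

typedef struct dbg_fibril {
    struct dbg_fibril *next;
    const char *name;              /* e.g. "fibril_timer_func" */
} dbg_fibril_t;

static dbg_fibril_t *all_fibrils;  /* head of the global list */

/* Register a fibril on the global list at creation time. */
dbg_fibril_t *dbg_fibril_create(const char *name)
{
    dbg_fibril_t *f = malloc(sizeof(*f));
    f->name = name;
    f->next = all_fibrils;
    all_fibrils = f;
    return f;
}

/* Unlink and free a fibril; anything still on the list afterwards
 * is a candidate leak visible in a core dump. */
void dbg_fibril_destroy(dbg_fibril_t *f)
{
    dbg_fibril_t **link = &all_fibrils;
    while (*link != f)
        link = &(*link)->next;
    *link = f->next;
    free(f);
}

size_t dbg_fibril_count(void)
{
    size_t n = 0;
    for (dbg_fibril_t *f = all_fibrils; f != NULL; f = f->next)
        n++;
    return n;
}
```

After serving a request, a count (or dump) of this list that keeps growing by two fibril_timer_func entries is exactly the leak signature described in comment 2.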

comment:4 Changed 5 years ago by Jiri Svoboda

Note, there is a potential problem where a timer fibril can deadlock trying to destroy its owning timer; I have yet to commit a fix for it, but I am not sure whether it manifests here.

Last edited 5 years ago by Jiri Svoboda (previous) (diff)

comment:5 Changed 5 years ago by Jakub Jermář

Well, fibril_timer_func is created with the default fibril stack size, which is 1 MB, so that explains the 1 MB areas.

comment:6 Changed 5 years ago by Jiri Svoboda

Milestone: 0.6.0
Resolution: fixed
Status: new → closed

This was introduced in mainline,2129 and fixed in 2136. I believe it was the deadlock problem when trying to destroy a timer from its own handler function.
