Opened 3 years ago
Closed 2 years ago
#840 closed defect (fixed)
Network link not coming up in VirtualBox (Intel Pro/1000)
Reported by: | Jiri Svoboda | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 0.12.1 |
Component: | helenos/unspecified | Version: | mainline |
Keywords: | virtualbox | Cc: | |
Blocker for: | | Depends on: | |
See also: | | | |
Description
When running HelenOS in VirtualBox, the HelenOS DHCP client fails to obtain an address (dnscfg prints "Nameserver: none") and the link state is shown as down.
VirtualBox can emulate two different PCnet adapters (not supported by HelenOS) and VirtIO-net (not supported by HelenOS). It can also emulate three different Intel Pro/1000 models. If I use one of the two MT models, I get the symptoms described above. With the third one, the T model, I don't get a NIC instance at all.
It seems currently there is no VirtualBox configuration in which we could get networking to work. Also note that the default network adapter for OS Other - Other/Unknown is PCnet-FAST III, which we don't support at all.
Change History (18)
comment:1 by , 3 years ago
Keywords: | virtualbox added |
---|
comment:2 by , 3 years ago
Milestone: | → 0.12.1 |
---|
comment:3 by , 3 years ago
comment:4 by , 3 years ago
I tried Haiku in VirtualBox 6.1 with NAT networking. Everything works with any of the VirtualBox-provided network adapters (both PCnet, all three Intel, VirtIO-net). In my case, the VM received IP address 10.0.
Looking at Haiku's /var/log/syslog, the DHCP daemon succeeded in obtaining an address (10.0.2.15), subnet mask (255.255.255.0), gateway (10.0.2.2) and nameserver (10.0.2.3) from the DHCP server at 10.0.2.2.
comment:5 by , 3 years ago
If I manually configure the IPv4 address 10.0.2.15 in HelenOS (with Intel 82540EM), I can successfully ping 10.0.2.2 and 10.0.2.3(!)
This means the NIC is transmitting and receiving frames, and IP works. It thus seems that just DHCP (or possibly broadcast) is not working.
comment:6 by , 3 years ago
As noted above I can ping 10.0.2.3 (which should be the DNS nameserver address), but if I manually configure it (dnscfg set-ns 10.0.2.3), and try to resolve an address, it waits for a bit, then times out with an error.
comment:7 by , 3 years ago
If I manually configure a default route
/ # inet create-sr 0.0.0.0/0 10.0.2.2 default
then ping helenos.org:
/ # ping 82.208.58.129
it works! Looks like ICMP works just fine, but we might be having trouble with UDP and/or TCP.
comment:8 by , 3 years ago
When I use IP address of helenos.org and use the download tool:
/ # download http://82.208.58.129/
Server returned status 403 Forbidden
/ #
It worked! The server returned an error because we didn't supply the correct Host field in the HTTP request.
This means TCP works. That narrows down the problem to just UDP (more likely) or DNS+DHCP (less likely).
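To illustrate the Host field remark above, here is a minimal sketch of an HTTP/1.1 request that names the site explicitly even though the connection goes to a bare IP address. The function name `build_get_request` is made up for this example, and this is not how the HelenOS download tool builds its request; it only shows the header the server is likely looking for.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request (illustrative helper only)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",       # tells a shared server which site is meant
        "Connection: close",
        "",                    # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# The TCP connection would still go to 82.208.58.129, but the request
# itself carries the host name.
request = build_get_request("helenos.org")
```

Sending `Host: helenos.org` over the same connection would presumably let the server pick the right site instead of answering 403, although that was not tested in this ticket.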
comment:9 by , 3 years ago
After fixing a problem in netecho (not being able to send any messages), I tried the following:
On the Linux host I started ncat -l -u 1234
(listen on UDP port 1234). Then tried sending messages from HelenOS in VirtualBox using # netecho -d <host-address>:1234
and it worked!
That means UDP works. Just DNS and DHCP do not.
comment:10 by , 3 years ago
I removed nconfsrv from init so that we can intervene before DHCP negotiation starts. Then we can do
# logset dhcp debug2
# /srv/net/nconfsrv
Looking at the log we can see:
- We send DHCPDISCOVER
- We receive offer (address 10.0.2.15/24, router 10.0.2.2, DNS 127.0.0.53, …)
- We send DHCPREQUEST
- We time out waiting for the answer
comment:11 by , 3 years ago
I enabled logging in udp and inetsrv and verified that the DHCPACK is not seen on UDP or IP layer in HelenOS, meaning the DHCP server probably did not respond to our DHCPREQUEST (as opposed to it responding but us dropping the message).
So we need to figure out why the DHCP server in VirtualBox is okay with our DHCPDISCOVER but does not like our DHCPREQUEST.
comment:12 by , 3 years ago
The problem is that the HelenOS DHCP client was setting ciaddr in the DHCP request header. ciaddr is to be filled in with the current IP address only when renewing it. When requesting an address for the first time, we are in the SELECTING state, and RFC 2131 states in section 4.3.2:
o DHCPREQUEST generated during SELECTING state: Client inserts the address of the selected server in 'server identifier', 'ciaddr' MUST be zero, 'requested IP address' MUST be filled in with the yiaddr value from the chosen DHCPOFFER.
It seems Qemu's DHCP server is tolerant here, but VirtualBox's is not.
Fixed this in changeset af259da6cd1876ab810c671932715fd43fabdc48.
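As a rough illustration of the RFC 2131 rule quoted above, the following sketch builds a DHCPREQUEST for the SELECTING state with ciaddr left at zero and the offered address carried in option 50. It is not the code from the changeset; the function name, the transaction ID and the MAC address are made up for the example, while the addresses 10.0.2.15 and 10.0.2.2 are the ones seen in this ticket.

```python
import socket
import struct

MAGIC_COOKIE = bytes([99, 130, 83, 99])
OPT_REQUESTED_IP = 50
OPT_MESSAGE_TYPE = 53
OPT_SERVER_ID = 54
OPT_END = 255
DHCPREQUEST = 3


def build_selecting_request(xid, mac, offered_ip, server_ip):
    """Build a DHCPREQUEST as sent in the SELECTING state (sketch only)."""
    zero = bytes(4)
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1, 1, 6, 0,        # op=BOOTREQUEST, htype=Ethernet, hlen, hops
        xid, 0, 0x8000,    # transaction ID, secs, flags (broadcast)
        zero,              # ciaddr: MUST be zero in SELECTING
        zero, zero, zero,  # yiaddr, siaddr, giaddr
        mac, b"", b"",     # chaddr, sname, file (zero padded by struct)
    )
    options = bytes([OPT_MESSAGE_TYPE, 1, DHCPREQUEST])
    # The offered address goes into 'requested IP address', not ciaddr.
    options += bytes([OPT_REQUESTED_IP, 4]) + socket.inet_aton(offered_ip)
    options += bytes([OPT_SERVER_ID, 4]) + socket.inet_aton(server_ip)
    options += bytes([OPT_END])
    return header + MAGIC_COOKIE + options


# Hypothetical transaction ID and MAC address, addresses from this ticket.
request = build_selecting_request(
    0x12345678, bytes.fromhex("080027000001"), "10.0.2.15", "10.0.2.2"
)
```

Bytes 12 to 15 of the result are the ciaddr field; a client that is renewing a lease would put its current address there instead and omit options 50 and 54.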
Now we successfully get IP address, subnet mask, DNS server and default gateway from DHCP:
- Address 10.0.2.15/24
- Router 10.0.2.2
- DNS server 127.0.0.53
Still DNS does not work.
The address 127.0.0.53 looks very suspicious. Considering that 53 is the code for DHCP option 'DHCP message type' I have a hunch that we did not parse the options in the DHCPOFFER/DHCPACK correctly.
If I manually set the DNS server address to 10.0.2.3, DNS still does not work. So I guess we have yet another problem there.
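To make the parsing hunch concrete: DHCP options are code/length/value triples, so a parser has to skip by the length byte and must not interpret value bytes as option codes. The sketch below is an assumption about how such a parser can look, not the HelenOS implementation; the sample bytes are hand-made for illustration and it ignores options split across several instances.

```python
import socket


def parse_dhcp_options(data: bytes) -> dict:
    """Parse the options area that follows the magic cookie (sketch only)."""
    options = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == 255:      # End option
            break
        if code == 0:        # Pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option header")
        length = data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options[code] = value
        i += 2 + length      # skip the value so its bytes are never read as codes
    return options


# Hand-made sample: message type ACK, subnet mask, router, DNS server.
sample = bytes([53, 1, 5,
                1, 4, 255, 255, 255, 0,
                3, 4, 10, 0, 2, 2,
                6, 4, 127, 0, 0, 53,
                255])
parsed = parse_dhcp_options(sample)
dns_server = socket.inet_ntoa(parsed[6])
```

With a parser of this shape, the 255 bytes inside the subnet mask and the trailing 53 inside the DNS address are treated as data, so a value of 127.0.0.53 in option 6 would have to come from the server itself.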
comment:13 by , 3 years ago
Sending DNS queries to 127.0.0.53 isn't working because, based on the loopback link address 127.0.0.1/24, the query gets sent down the loopback link, comes back and then gets dropped (because we don't have address 127.0.0.53).
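The routing decision described above can be checked with a few lines of Python. This is only an illustration of the subnet arithmetic, not HelenOS code, and the variable names are made up.

```python
import ipaddress

# Loopback link address as configured in HelenOS, per the comment above.
loopback_link = ipaddress.ip_interface("127.0.0.1/24")
local_addresses = {loopback_link.ip}

dns_server = ipaddress.ip_address("127.0.0.53")

# The destination falls inside 127.0.0.0/24, so it is routed to loopback ...
on_loopback_link = dns_server in loopback_link.network
# ... but it is not one of our own addresses, so the packet is dropped.
is_local = dns_server in local_addresses

# For comparison, 10.0.2.3 does not match the loopback network at all.
other_on_loopback = ipaddress.ip_address("10.0.2.3") in loopback_link.network
```

Here `on_loopback_link` is true while `is_local` is false, which matches the behavior seen in the log: the query leaves via loopback, returns, and has no local address to be delivered to.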
When I manually set dns server address to 10.0.2.3 and enable debug2 on inetsrv I can see that the DNS requests are being sent, but nothing comes back.
I dumped the DNS requests and they are byte-for-byte same as those sent by 'getent hosts xxx' in Linux.
Looking at my host's /etc/resolv.conf I can see where the 127.0.0.53 came from: it contains 'nameserver 127.0.0.53', which looks like systemd's local name server.
Now I don't understand why VirtualBox passed this address to HelenOS via DHCP. If I run Haiku in VirtualBox in practically the same configuration, it displays 10.0.2.3 as the DNS server address (and works correctly).
comment:14 by , 3 years ago
If I edit my host's /etc/resolv.conf and put in 'nameserver 192.168.0.1' (the IP address of my wireless router), I can still resolve host names from Linux. In HelenOS/VirtualBox I now get 192.168.0.1 as the DNS server address via DHCP(!). I can ping this address. At first it seemed like DNS requests still did not work, but then it started to work. Now I can resolve host names from within HelenOS/VB, where the nameserver in HelenOS is configured to 192.168.0.1(!) This shows that bidirectional UDP through NAT works as expected.
It would be easy to blame this strange behavior on systemd+VirtualBox, but that does not explain why it only happens with HelenOS and not with Haiku (Linux, etc). Why does Haiku get DNS address 10.0.2.3 while HelenOS gets a different one?
comment:15 by , 2 years ago
I dumped the DHCP server responses and the incriminating address 127.0.0.53 appears both in DHCPOFFER and DHCPACK. Thus the problem occurs as soon as we send DHCPDISCOVER.
I was looking at the differences between the DHCP discover messages sent by HelenOS and by Linux. I couldn't find anything obviously wrong with the HelenOS messages, but there are some differences.
Here's a Wireshark dump of Linux discover message:
Dynamic Host Configuration Protocol (Discover)
    Message type: Boot Request (1)
    Hardware type: Ethernet (0x01)
    Hardware address length: 6
    Hops: 0
    Transaction ID: 0xa55e2e70
    Seconds elapsed: 2
    Bootp flags: 0x0000 (Unicast)
    Client IP address: 0.0.0.0
    Your (client) IP address: 0.0.0.0
    Next server IP address: 0.0.0.0
    Relay agent IP address: 0.0.0.0
    Client MAC address: LCFCHeFe_b8:e0:2d (50:7b:9d:b8:e0:2d)
    Client hardware address padding: 00000000000000000000
    Server host name not given
    Boot file name not given
    Magic cookie: DHCP
    Option: (53) DHCP Message Type (Discover)
    Option: (61) Client identifier
        Length: 7
        Hardware type: Ethernet (0x01)
        Client MAC address: LCFCHeFe_b8:e0:2d (50:7b:9d:b8:e0:2d)
    Option: (55) Parameter Request List
    Option: (57) Maximum DHCP Message Size
    Option: (50) Requested IP Address (10.163.47.226)
    Option: (12) Host Name
    Option: (255) End
Differences with HelenOS:
- Transaction ID is not random (it is always 42)
- Bootp flags: 0x8000 (broadcast)
- Seconds elapsed is always 0
- The only option used is 53 (message type)
I modified the transaction ID to use a random number generator and also hacked inetsrv to accept all packets regardless of target IP address so that I could try Bootp flags 0 (unicast). Neither change had any effect on the problem, so it seems this is not it.
So currently it's looking like what's triggering the problem could be the absence of one of the DHCP options:
- client identifier (61)
- parameter request list (55)
- maximum dhcp message size (57)
- requested IP address (50)
- host name (12)
comment:16 by , 2 years ago
Okay, here's the root cause.
If parameter request list is not provided, or it is provided and Domain Name Server (6) is not listed, then the problem occurs and VirtualBox provides the incorrect DNS server address.
This is VirtualBox 6.1.
comment:17 by , 2 years ago
I filed a bug with VirtualBox.
Apart from that, HelenOS should clearly request DNS server address if it wants to use it, so it's an easy fix.
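A minimal sketch of what requesting the DNS server address means on the wire: the client adds a Parameter Request List (option 55) that includes Domain Name Server (6). This is not the code from the eventual fix; the function name and the choice of also requesting subnet mask and router are assumptions made for the example.

```python
OPT_SUBNET_MASK = 1
OPT_ROUTER = 3
OPT_DNS_SERVER = 6
OPT_MESSAGE_TYPE = 53
OPT_PARAM_REQUEST_LIST = 55
OPT_END = 255
DHCPDISCOVER = 1


def build_discover_options(
    requested=(OPT_SUBNET_MASK, OPT_ROUTER, OPT_DNS_SERVER),
):
    """Build the options area of a DHCPDISCOVER (sketch only)."""
    options = bytes([OPT_MESSAGE_TYPE, 1, DHCPDISCOVER])
    # Option 55: one byte per option code the client wants to receive.
    options += bytes([OPT_PARAM_REQUEST_LIST, len(requested)])
    options += bytes(requested)
    options += bytes([OPT_END])
    return options


discover_options = build_discover_options()
```

The same list would normally be repeated in the DHCPREQUEST. According to the finding above, VirtualBox 6.1 hands out the incorrect DNS server address whenever option 6 is missing from this list.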
comment:18 by , 2 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed in changeset b7155d7afd4c423375b9108bfa861575b8eb6a04.
I tested this again with the latest HelenOS and VirtualBox 6.1. I am not sure why, but I am getting link 'up' for both MT adapters (was it really down before?), yet DHCP times out trying to obtain an address / DNS server. Here are the detailed results: