github.com/hujun-open/etherconn
Version: 0.7.0
Repository: https://github.com/hujun-open/etherconn.git
Documentation: pkg.go.dev

# README

etherconn

Package etherconn is a Go package that allows a user to send/receive Ethernet payloads (such as IP packets) or UDP packets with custom Ethernet encapsulation, such as MAC address and VLAN tags, without creating a corresponding interface in the OS.

For example, with etherconn, a program can send/receive a UDP or IP packet with a source MAC address and VLAN tags that don't exist on, and are not provisioned in, any OS interface.

Another benefit is that, since etherconn bypasses the "normal" OS kernel routing and IP stack, large-scale setups (for example, tens of thousands of conns) are no longer subject to kernel limits such as the number of sockets/file descriptors, UDP buffer size, etc.

Lastly, etherconn.RUDPConn implements the net.PacketConn interface, so it can be easily integrated into existing code.

etherconn supports the following types of forwarding engines:

  • RawSocketRelay: uses an AF_PACKET socket, Linux only
  • XDPRelay: uses an XDP socket, Linux only
  • RawSocketRelayPcap: uses libpcap, Windows and Linux

XDPRelay can achieve higher performance than RawSocketRelay, especially in a multi-queue, multi-core environment.

Performance

Tested in a KVM VM with 8 hyper-threaded cores and an Intel 82599ES 10GE NIC, XDPRelay achieves 1 Mpps (1000-byte packets).

What's New

  1. Added RawSocketRelayPcap, which supports both Windows and Linux

Dependencies

etherconn requires libpcap on Linux and npcap on Windows.

Usage

interface <---> PacketRelay <----> EtherConn <---> RUDPConn
                            <----> EtherConn <---> RUDPConn
                            <----> EtherConn <---> RUDPConn

  1. Create a PacketRelay instance bound to an interface. PacketRelay is the "forwarding engine" that does the actual packet sending/receiving for all EtherConn instances registered with it; PacketRelay sends/receives Ethernet packets.

  2. Create one EtherConn for each source MAC+VLAN(s)+EtherType(s) combination needed, and register it with the PacketRelay instance. EtherConn sends/receives Ethernet payloads such as IP packets.

  3. Create one RUDPConn instance for each UDP endpoint (IP+port) needed, on top of an EtherConn. RUDPConn sends/receives UDP payloads.

  4. RUDPConn to EtherConn is a 1:1 mapping, while EtherConn to PacketRelay is an N:1 mapping. Because of the 1:1 mapping, EtherConn forwards all received UDP packets to its RUDPConn, even when the packet's destination IP/UDP port differs from the RUDPConn's endpoint; the RUDPConn can either accept only packets addressed to its endpoint or accept any UDP packet.
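
The choice in step 4 between strict matching and accepting any UDP packet can be pictured as a small filter. The following is a minimal sketch using only the Go standard library, not etherconn's actual code; acceptUDP and mustAddr are hypothetical helpers introduced for illustration:

```go
package main

import (
	"fmt"
	"net"
)

// mustAddr parses an "ip:port" string; hypothetical helper for this sketch.
func mustAddr(s string) *net.UDPAddr {
	a, err := net.ResolveUDPAddr("udp", s)
	if err != nil {
		panic(err)
	}
	return a
}

// acceptUDP reports whether a UDP packet addressed to dst should be delivered
// to an endpoint bound to local. With acceptAny set, every packet handed over
// by the lower layer is delivered; otherwise IP and port must both match.
// Hypothetical helper, not part of etherconn.
func acceptUDP(local, dst *net.UDPAddr, acceptAny bool) bool {
	if acceptAny {
		return true
	}
	return local.Port == dst.Port && local.IP.Equal(dst.IP)
}

func main() {
	local := mustAddr("0.0.0.0:68")
	// A DHCP server replies to the address it has just assigned.
	reply := mustAddr("192.0.2.10:68")

	fmt.Println("strict:", acceptUDP(local, reply, false))
	fmt.Println("accept any:", acceptUDP(local, reply, true))
}
```

With strict matching, an endpoint bound to 0.0.0.0:68 would drop a reply sent to a newly assigned address, which is the situation the accept-any option is meant for.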

Egress direction:

UDP_payload -> RUDPConn(add UDP&IP header) -> EtherConn(add Ethernet header) -> PacketRelay
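
As an illustration of the last encapsulation step in the egress direction, here is a minimal sketch of how an Ethernet header with optional VLAN tags is laid out in front of a payload. It uses only the standard library and is not etherconn's implementation; vlanTag, appendU16 and encapsulate are hypothetical names, and the priority bits of each tag are assumed to be zero:

```go
package main

import "fmt"

// vlanTag is a hypothetical stand-in for one 802.1Q/802.1ad tag.
type vlanTag struct {
	TPID uint16 // EtherType announcing the tag, e.g. 0x8100
	ID   uint16 // 12-bit VLAN ID
}

// appendU16 appends v in network byte order (big endian).
func appendU16(b []byte, v uint16) []byte {
	return append(b, byte(v>>8), byte(v))
}

// encapsulate prepends an Ethernet header to payload: destination MAC,
// source MAC, VLAN tags from outermost to innermost, then the EtherType
// of the payload.
func encapsulate(dst, src [6]byte, tags []vlanTag, etherType uint16, payload []byte) []byte {
	frame := make([]byte, 0, 14+4*len(tags)+len(payload))
	frame = append(frame, dst[:]...)
	frame = append(frame, src[:]...)
	for _, tag := range tags {
		frame = appendU16(frame, tag.TPID)
		frame = appendU16(frame, tag.ID&0x0fff) // PCP and DEI bits left at 0
	}
	frame = appendU16(frame, etherType)
	return append(frame, payload...)
}

func main() {
	dst := [6]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff}
	src := [6]byte{0xaa, 0xbb, 0xcc, 0x11, 0x22, 0x33}
	ipPacket := []byte{0x45, 0x00} // placeholder for an IPv4 packet
	frame := encapsulate(dst, src, []vlanTag{{TPID: 0x8100, ID: 100}}, 0x0800, ipPacket)
	fmt.Printf("% x\n", frame)
}
```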

Ingress direction:

Ethernet_pkt -> (BPFilter) PacketRelay (parse pkt) --- EtherPayload(e.g IP_pkt) --> EtherConn
Ethernet_pkt -> (BPFilter) PacketRelay (parse pkt) --- UDP_payload --> RUDPConn (option to accept any UDP pkt)

Note: PacketRelay parses each packet for its Ethernet payload based on the following rules:

  • PacketRelay has a default BPFilter that only allows IPv4/ARP/IPv6 packets
  • If the Ethernet packet has no VLAN tag, dstMAC + EtherType in the Ethernet header is used to locate the registered EtherConn
  • Otherwise, dstMAC + VLANs + EtherType in the last VLAN tag is used
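
The lookup rules above can be sketched as a function that turns a received frame into a fixed-size key and uses it as a map index. This is an illustrative sketch, not the package's parser: the two VLAN slots, the 0xffff filler for an unused slot, and the recognised tag EtherTypes (0x8100, 0x88a8) are assumptions, and l2Key and keyFromFrame are hypothetical names:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	maxVLANs  = 2      // assumed number of VLAN slots in the key
	noVLANTag = 0xffff // assumed filler for an unused VLAN slot
)

// l2Key holds 6 bytes of MAC, one 2-byte slot per VLAN ID (outermost first)
// and finally the innermost EtherType.
type l2Key [6 + 2*maxVLANs + 2]byte

// keyFromFrame derives the lookup key from a received Ethernet frame and
// returns it together with the Ethernet payload.
func keyFromFrame(frame []byte) (l2Key, []byte, error) {
	var key l2Key
	if len(frame) < 14 {
		return key, nil, errors.New("frame too short")
	}
	copy(key[:6], frame[:6]) // destination MAC
	for i := 0; i < maxVLANs; i++ {
		binary.BigEndian.PutUint16(key[6+2*i:], noVLANTag)
	}
	off, n := 12, 0
	etherType := binary.BigEndian.Uint16(frame[off:])
	for etherType == 0x8100 || etherType == 0x88a8 {
		if n == maxVLANs {
			return key, nil, errors.New("too many VLAN tags")
		}
		if len(frame) < off+6 {
			return key, nil, errors.New("truncated VLAN tag")
		}
		vid := binary.BigEndian.Uint16(frame[off+2:]) & 0x0fff
		binary.BigEndian.PutUint16(key[6+2*n:], vid)
		n++
		off += 4
		etherType = binary.BigEndian.Uint16(frame[off:]) // EtherType after this tag
	}
	binary.BigEndian.PutUint16(key[len(key)-2:], etherType)
	return key, frame[off+2:], nil
}

func main() {
	// Registry of endpoints, keyed the same way as received frames.
	registry := map[l2Key]string{
		{0xaa, 0xbb, 0xcc, 0x11, 0x22, 0x33, 0x00, 0x64, 0xff, 0xff, 0x08, 0x00}: "econn-vlan100",
	}
	frame := []byte{
		0xaa, 0xbb, 0xcc, 0x11, 0x22, 0x33, // dst MAC
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01, // src MAC
		0x81, 0x00, 0x00, 0x64, // 802.1Q tag, VLAN 100
		0x08, 0x00, // EtherType IPv4
		0x45, 0x00, // start of payload
	}
	key, payload, err := keyFromFrame(frame)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	conn, ok := registry[key]
	fmt.Println(conn, ok, len(payload))
}
```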

SharedEtherConn and SharingRUDPConn

EtherConn and RUDPConn are a 1:1 mapping, which means two RUDPConn instances can't share the same MAC+VLAN+EtherType combination.

SharedEtherConn and SharingRUDPConn solve this issue:

                                    L2Endpointkey-1
interface <---> PacketRelay <----> SharedEtherConn <---> SharingRUDPConn (L4Recvkey-1)
                                                   <---> SharingRUDPConn (L4Recvkey-2)
                                                   <---> SharingRUDPConn (L4Recvkey-3)
                                    L2Endpointkey-2
                            <----> SharedEtherConn <---> SharingRUDPConn (L4Recvkey-4)
                                                   <---> SharingRUDPConn (L4Recvkey-5)
                                                   <---> SharingRUDPConn (L4Recvkey-6)
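
The fan-out in the diagram, one L2 endpoint feeding several UDP receivers selected by an L4 key, can be sketched with a map of channels. This is a simplified, single-goroutine illustration and not the package's implementation; l4Key, sharedConn, register and dispatch are hypothetical names, and a real implementation would also need locking:

```go
package main

import "fmt"

// l4Key identifies a UDP receiver; a simplified stand-in for an L4 receive key.
type l4Key struct {
	IP   string
	Port uint16
}

// sharedConn fans packets received on one L2 endpoint out to several
// UDP receivers.
type sharedConn struct {
	receivers map[l4Key]chan []byte
}

func newSharedConn() *sharedConn {
	return &sharedConn{receivers: make(map[l4Key]chan []byte)}
}

// register creates a buffered receive channel for the given UDP endpoint.
func (s *sharedConn) register(k l4Key, depth int) <-chan []byte {
	ch := make(chan []byte, depth)
	s.receivers[k] = ch
	return ch
}

// dispatch delivers payload to the receiver registered under k. It returns
// false when no receiver matches or the receiver's channel is full.
func (s *sharedConn) dispatch(k l4Key, payload []byte) bool {
	ch, ok := s.receivers[k]
	if !ok {
		return false
	}
	select {
	case ch <- payload:
		return true
	default:
		return false
	}
}

func main() {
	shared := newSharedConn()
	dhcp := shared.register(l4Key{IP: "192.0.2.10", Port: 68}, 8)
	other := shared.register(l4Key{IP: "192.0.2.10", Port: 5353}, 8)

	fmt.Println(shared.dispatch(l4Key{IP: "192.0.2.10", Port: 68}, []byte("offer")))
	fmt.Println(string(<-dhcp), len(other))
	fmt.Println(shared.dispatch(l4Key{IP: "192.0.2.10", Port: 9}, []byte("stray")))
}
```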

Example:

	// This is an example of using RUDPConn: a DHCPv4 client.
	// It also uses "github.com/insomniacslk/dhcp/dhcpv4/nclient4" for the DHCPv4 client part.
	package main

	import (
		"context"
		"log"
		"net"

		"github.com/hujun-open/etherconn"
		"github.com/insomniacslk/dhcp/dhcpv4/nclient4"
	)

	func main() {
		// create PacketRelay for interface "enp0s10"
		relay, err := etherconn.NewRawSocketRelay(context.Background(), "enp0s10")
		if err != nil {
			log.Fatalf("failed to create PacketRelay, %v", err)
		}
		defer relay.Stop()
		mac, err := net.ParseMAC("aa:bb:cc:11:22:33")
		if err != nil {
			log.Fatalf("failed to parse MAC address, %v", err)
		}
		vlanList := []*etherconn.VLAN{
			{
				ID:        100,
				EtherType: 0x8100,
			},
		}
		// create EtherConn with src MAC "aa:bb:cc:11:22:33", VLAN 100 (DOT1Q EtherType 0x8100)
		// and DefaultEtherTypes; the MAC/VLAN doesn't need to be provisioned in the OS
		econn := etherconn.NewEtherConn(mac, relay, etherconn.WithVLANs(vlanList))
		// create RUDPConn using 0.0.0.0 and UDP port 68 as source, with the option to accept
		// any UDP packet, since the DHCP server sends its reply to the assigned IP address
		rudpconn, err := etherconn.NewRUDPConn("0.0.0.0:68", econn, etherconn.WithAcceptAny(true))
		if err != nil {
			log.Fatalf("failed to create RUDPConn, %v", err)
		}
		// create DHCPv4 client with the RUDPConn
		clnt, err := nclient4.NewWithConn(rudpconn, mac, nclient4.WithDebugLogger())
		if err != nil {
			log.Fatalf("failed to create dhcpv4 client, %v", err)
		}
		// do DORA
		_, _, err = clnt.Request(context.Background())
		if err != nil {
			log.Fatalf("failed to finish DORA, %v", err)
		}
	}

There is a more complicated example in the example folder.

Limitations:

* Linux and Windows only
* since etherconn bypasses the OS IP stack, it is the user's job to provide functions like:
    * routing next-hop lookup
    * IP -> MAC address resolution
* no IP packet fragmentation/reassembly support
* using etherconn requires root privileges on Linux
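
Since next-hop lookup and IP -> MAC resolution are left to the user, a caller might keep its own small tables. The sketch below is one possible shape, using only the standard library: a longest-prefix-match route lookup followed by a static neighbor lookup, falling back to the broadcast MAC. All names (table, addRoute, addNeighbor, resolve, lookup) are hypothetical, and the assumption that a resolver maps a net.IP to a net.HardwareAddr should be checked against the package documentation:

```go
package main

import (
	"fmt"
	"net"
)

var broadcast = net.HardwareAddr{0xff, 0xff, 0xff, 0xff, 0xff, 0xff}

type route struct {
	prefix  *net.IPNet
	nextHop net.IP // nil: destination is directly connected
}

type table struct {
	routes    []route
	neighbors map[string]net.HardwareAddr // keyed by IP in string form
}

func newTable() *table {
	return &table{neighbors: make(map[string]net.HardwareAddr)}
}

// addRoute installs a route; an empty via marks the prefix as directly connected.
func (t *table) addRoute(cidr, via string) error {
	_, prefix, err := net.ParseCIDR(cidr)
	if err != nil {
		return err
	}
	var nh net.IP
	if via != "" {
		if nh = net.ParseIP(via); nh == nil {
			return fmt.Errorf("invalid next hop %q", via)
		}
	}
	t.routes = append(t.routes, route{prefix: prefix, nextHop: nh})
	return nil
}

// addNeighbor records a static IP -> MAC mapping.
func (t *table) addNeighbor(ip, mac string) error {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return fmt.Errorf("invalid IP %q", ip)
	}
	hw, err := net.ParseMAC(mac)
	if err != nil {
		return err
	}
	t.neighbors[parsed.String()] = hw
	return nil
}

// resolve picks the next hop by longest prefix match (treating dst as
// directly connected when no route matches), then maps it to a MAC address,
// falling back to broadcast for unknown neighbors.
func (t *table) resolve(dst net.IP) net.HardwareAddr {
	target, best := dst, -1
	for _, r := range t.routes {
		ones, _ := r.prefix.Mask.Size()
		if r.prefix.Contains(dst) && ones > best {
			best, target = ones, dst
			if r.nextHop != nil {
				target = r.nextHop
			}
		}
	}
	if mac, ok := t.neighbors[target.String()]; ok {
		return mac
	}
	return broadcast
}

// lookup is a convenience wrapper that takes and returns text.
func (t *table) lookup(dst string) string {
	return t.resolve(net.ParseIP(dst)).String()
}

func main() {
	tbl := newTable()
	_ = tbl.addRoute("192.0.2.0/24", "")       // on-link subnet
	_ = tbl.addRoute("0.0.0.0/0", "192.0.2.1") // default gateway
	_ = tbl.addNeighbor("192.0.2.1", "02:00:00:00:00:01")

	fmt.Println(tbl.lookup("198.51.100.7")) // resolved via the gateway
	fmt.Println(tbl.lookup("192.0.2.99"))   // on-link, unknown neighbor
}
```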

Built-in XDP Kernel Program

etherconn includes a built-in XDP kernel program binary; its source is in etherconnkern.

# Packages

No description provided by the author

# Functions

GetIfNameViaDesc returns interface name via its description, this could be used on windows to get the interface name.
GetIFQueueNum uses ethtool to get the number of combined queues of the interface; it returns 1 if it fails to get the info.
NewChanMap creates a new instance of ChanMap.
NewEtherConn creates a new EtherConn instance; mac is used as part of the EtherConn's L2Endpoint; relay is the PacketRelay that the EtherConn instance registers with; options specifies the EtherConnOption(s) to use.
NewL2EndpointFromMACVLAN creates a new L2Endpoint from mac and vlans; its Etype is set to any.
NewL2EndpointFromMACVLANEtype creates a new L2Endpoint from mac, vlans and etype.
NewL4RecvKeyViaUDPAddr returns a L4RecvKey from a net.UDPAddr.
NewPcapConn creates a new PcapRelay instance for the specified ifname.
NewRawSocketRelay creates a new RawSocketRelay instance, bound to the interface ifname, optionally along with RelayOption functions.
NewRUDPConn creates a new RUDPConn, with specified EtherConn, and, optionally RUDPConnOption(s).
NewSharedEtherConn creates a new SharedEtherConn; mac is the SharedEtherConn's own MAC address; relay is the underlying PacketRelay; ecopts is a list of EtherConnOption that can be used to customize the new SharedEtherConn; all currently defined EtherConnOption can also be used for SharedEtherConn.
NewSharingRUDPConn creates a new SharingRUDPConn; src is a string representing its UDP address, in a format supported by net.ResolveUDPAddr().
NewXDPRelay creates a new instance of XDPRelay, by default, the XDPRelay binds to all queues of the specified interface.
ResolveNexhopMACWithBrodcast is the default resolve function; it always returns the broadcast MAC.
SetIfVLANOffloading sets the HW VLAN offloading feature on/off for the interface; turning the feature off is needed when using XDPRelay and the expected VLAN tags are missing from received packets.
SetPromisc puts the interface in promiscuous mode.
WithAcceptAny allows RUDPConn to accept any UDP packets, even those not destined to its address.
WithBPFFilter sets the BPF filter, which is a pcap filter string; an empty string means no filter; by default, the relay has a filter that only allows traffic with the specified EtherTypes.
WithDebug enables/disables debug log output.
WithDefault registers the EtherConn as the default EtherConn for received traffic, see PacketRelay.RegisterDefault for details.
WithDefaultReceival creates a default receiving channel; all received packets that don't match any explicit EtherConn are sent to this channel; use RegisterDefault to get the default receiving channel.
WithEtherTypes specifies a list of Ethernet types that this EtherConn is interested in; the specified Ethernet types are the types of the inner payload; the default list is DefaultEtherTypes.
WithMaxEtherFrameSize specifies the max Ethernet frame size the RawSocketRelay could receive.
WithMultiEngine specifies the number of internal send/recv routines; count must be >= 1; the default value is 1.
WithPerClntChanRecvDepth specifies the per client (EtherConn) receive channel depth; by default, DefaultPerClntRecvChanDepth is used.
WithQueueID specifies a list of interface queue id (start from 0) that the XDPRelay binds to; by default, XDPRelay will use all queues.
WithRecvMulticast allows/disallows EtherConn to receive multicast/broadcast Ethernet traffic.
WithRecvTimeout specifies the receive timeout for RawSocketRelay.
WithResolveNextHopMacFunc specifies a function to resolve a destination IP address to next-hop MAC address; by default, ResolveNexhopMACWithBrodcast is used.
WithSendChanDepth specifies the send channel depth, by default, DefaultSendChanDepth is used.
WithSendingMode sets the XDPRelay's sending mode to m.
WithVLANs specifies VLAN(s) as part of EtherConn's L2Endpoint.
WithXDPDebug enables/disables debug log output.
WithXDPDefaultReceival creates a default receiving channel; all received packets that don't match any explicit EtherConn are sent to this channel; use RegisterDefault to get the default receiving channel.
WithXDPEtherTypes specifies a list of EtherTypes that the relay accepts; if a received packet doesn't have an expected EtherType, it is passed to the kernel.
WithXDPExtProg loads an external XDP kernel program instead of using the built-in one.
WithXDPPerClntRecvChanDepth sets the depth of the receiving channel for each registered.
WithXDPRXPktHandler sets h as the rx packet handler.
WithXDPSendChanDepth sets the depth of the sending channel.
WithXDPTXPktHandler sets h as the tx packet handler.
WithXDPUMEMChunkSize specifies the XDP UMEM chunk size, which implicitly sets the max packet size that XDPRelay can handle; it must be either 4096 or 2048 (kernel XDP limitation).
WithXDPUMEMNumOfTrunk specifies the number of UMEM trunks; it must be a power of 2.
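
The two UMEM constraints stated above (chunk size of 2048 or 4096, number of trunks a power of 2) are easy to check before creating a relay. A minimal sketch, assuming nothing about how etherconn itself validates these options; validateUMEM is a hypothetical helper:

```go
package main

import (
	"fmt"
	"math/bits"
)

// validateUMEM checks the documented constraints: the chunk size must be
// 2048 or 4096, and the number of trunks must be a power of 2.
func validateUMEM(chunkSize, numTrunks uint) error {
	if chunkSize != 2048 && chunkSize != 4096 {
		return fmt.Errorf("chunk size %d is neither 2048 nor 4096", chunkSize)
	}
	// A power of 2 has exactly one bit set; this also rejects 0.
	if bits.OnesCount(numTrunks) != 1 {
		return fmt.Errorf("number of trunks %d is not a power of 2", numTrunks)
	}
	return nil
}

func main() {
	fmt.Println(validateUMEM(4096, 16384))
	fmt.Println(validateUMEM(1500, 16384))
	fmt.Println(validateUMEM(2048, 1000))
}
```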

# Constants

DefaultMaxEtherFrameSize is the default max size of an Ethernet frame that PacketRelay can receive from the interface.
DefaultPerClntRecvChanDepth is the default value for the per registered client (EtherConn) receive channel depth.
DefaultRelayRecvTimeout is the default value for PacketReceive receiving timeout.
DefaultSendChanDepth is the default value for PacketRelay send channel depth, e.g.
DefaultVLANEtype is the default Ethernet type for vlan tags, used by function GetVLANs().
DefaultXDPChunkSize is the default size for XDP UMEM chunk.
DefaultXDPUMEMNumOfTrunk is the default number of UMEM trunks.
must be 6+2*n.
MaxNumVLAN specifies the max number of VLANs this pkg supports.
NOVLANTAG is the value that represents no VLAN tag in L2EndpointKey.
XDPSendingModeBatch is the TX mode that sends a batch of packets at a time; only use this mode when the needed TX pps is high.
XDPSendingModeSingle is the TX mode that sends one packet at a time; this is the default mode.

# Variables

BroadCastMAC is the broadcast MAC address.
DefaultEtherTypes is the default list of Ethernet types for RawPacketRelay and EtherConn.
ErrRelayStopped is the error returned when relay already stopped.
ErrTimeOut is the error returned when an operation times out.

# Structs

ChanMap is a goroutine-safe map; key is interface{}, val is a chan *RelayReceival.
EtherConn sends/receives Ethernet payloads such as IP packets with customizable Ethernet encapsulation such as MAC and VLANs, without provisioning them in the OS.
L2Endpoint represents a layer 2 endpoint that sends/receives Ethernet frames.
RawSocketRelay implements PacketRelay interface, using AF_PACKET socket.
RelayPacketStats is the PacketRelay's forwarding stats; use atomic.LoadUint64 to read the values.
RelayReceival is what PacketRelay received and parsed.
RUDPConn implements the net.PacketConn interface; it is used to send/receive UDP payloads, using an underlying EtherConn for packet forwarding.
SharedEtherConn can be mapped to multiple RUDPConn.
SharingRUDPConn is a UDP connection that can share the same SharedEtherConn.
VLAN represents a VLAN tag.
XDPRelay is a PacketRelay implementation that uses AF_XDP Socket.
XdpSockStats holds per XDP socket/queue stats.
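
The advice to read RelayPacketStats with atomic.LoadUint64 follows the usual pattern for counters that are updated by forwarding goroutines while being read elsewhere. A minimal sketch of that pattern; relayStats and its field names are made up for illustration and do not reflect the real struct:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// relayStats is a hypothetical counter block in the spirit of
// RelayPacketStats; the field names are made up for this sketch.
type relayStats struct {
	Tx uint64
	Rx uint64
}

// countPackets simulates several goroutines each "sending" perWorker packets
// and returns the Tx counter as read with atomic.LoadUint64.
func countPackets(workers, perWorker int) uint64 {
	var stats relayStats
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				atomic.AddUint64(&stats.Tx, 1) // writer side
			}
		}()
	}
	wg.Wait()
	return atomic.LoadUint64(&stats.Tx) // reader side
}

func main() {
	fmt.Println(countPackets(4, 1000))
}
```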

# Interfaces

PacketRelay is an interface for the packet forwarding engine; RawSocketRelay implements this interface.

# Type aliases

EtherConnOption is a function used to provide customized options when creating an EtherConn.
L2EndpointKey is the key that identifies an L2Endpoint: the first 6 bytes are the MAC address, followed by the VLAN IDs in order (from outer to inner), each VLAN ID being two bytes in network byte order; a VLAN ID of NOVLANTAG means no such tag; the last two bytes are the innermost EtherType.
L4RecvKey represents a Layer 4 receive endpoint: bytes [0:15] are the IP address, [16] is the IP protocol, [17:18] is the port number, in big endian.
RelayOption is a function used to provide customized options when creating a RawSocketRelay.
RUDPConnOption is a function used to provide customized options when creating an RUDPConn.
SharedEtherConnOption is the option to customize a new SharedEtherConn.
SharingRUDPConnOptions is the option to customize a new SharingRUDPConn.
VLANs is a slice of VLAN.
XDPRelayOption can be used in NewXDPRelay to customize an XDPRelay upon creation.
XDPSendingMode is the TX mode of XDPRelay.
XDPSocketPktHandler is a handler function that can be used for rx/tx packets of an XDP socket.
No description provided by the author
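
The L4RecvKey layout described above (16 bytes of IP address, 1 byte of IP protocol, 2 bytes of port in big endian) can be reproduced with the standard library. This is a sketch under assumptions rather than the package's code: it assumes IPv4 addresses are stored in their 16-byte IPv4-mapped form, and l4RecvKey, keyFromUDPAddr and keyFromString are hypothetical names:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

const protoUDP = 17 // IANA protocol number for UDP

// l4RecvKey follows the documented layout: bytes 0..15 IP address,
// byte 16 IP protocol, bytes 17..18 port in big endian.
type l4RecvKey [19]byte

// keyFromUDPAddr builds the key for a UDP endpoint. IPv4 addresses are
// stored in 16-byte IPv4-mapped form (an assumption of this sketch).
func keyFromUDPAddr(addr *net.UDPAddr) l4RecvKey {
	var k l4RecvKey
	copy(k[:16], addr.IP.To16())
	k[16] = protoUDP
	binary.BigEndian.PutUint16(k[17:], uint16(addr.Port))
	return k
}

// keyFromString parses an address in a format accepted by
// net.ResolveUDPAddr, such as "192.0.2.1:68" or "[2001:db8::1]:547".
func keyFromString(s string) (l4RecvKey, error) {
	addr, err := net.ResolveUDPAddr("udp", s)
	if err != nil {
		return l4RecvKey{}, err
	}
	return keyFromUDPAddr(addr), nil
}

func main() {
	for _, s := range []string{"192.0.2.1:68", "[2001:db8::1]:547"} {
		k, err := keyFromString(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-20s % x\n", s, k[:])
	}
}
```

Because the key is a fixed-size array, it is comparable and can be used directly as a map key for receiver lookup.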