I am finishing up my initial work on writing my own version of the Loki program. Along the way I have learned a bit about network programming, specifically dealing with sockets. I thought I would record some of that knowledge here for future reference.
Let's start with the transport protocols. The two main ones are UDP and TCP.
UDP is simple and has less overhead. It does not guarantee that data gets delivered, and it does not guarantee that data arrives in the order it was sent. If a datagram is lost, UDP does not retransmit it automatically. There is no concept of a connection with UDP. You just blast data to its destination. The recipient has to read each datagram in a single call. It cannot read part of a message now and the rest later. Popular protocols that run on top of UDP include DNS and SNMP.
TCP is a reliable transport protocol. If some data does not make it to the destination, TCP will handle retransmission. It will also make sure that the order of the data you send is preserved on the other end. You can think of TCP as a byte stream. The destination at the other end can read the bytes in any chunks they want. They don't have to read the message all at once.
When doing UDP and TCP network programming, you will be using sockets. Socket programming is old. It has been around since before the World Wide Web. One popular old library is Berkeley Sockets, which has been around since 1982. There is also Windows Sockets (AKA WinSock), which has been around since 1993. These sockets are point to point and bidirectional.
With sockets, there is a client side and a server side. The calls are somewhat similar for UDP and TCP. Generally you have to do less with UDP because it is connectionless. On the client, with UDP you call socket(), sendto() and/or recvfrom(), and finally close(). On the server, with UDP you call socket(), optionally setsockopt(), bind(), sendto() and/or recvfrom(), and close().
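Here is a rough sketch of the UDP call sequence above, using Python's socket module rather than the C API. The function name udp_echo_once, the loopback address, and the uppercase reply are my own choices for illustration. Both ends live in one process just to keep the example short.

```python
import socket


def udp_echo_once(message: bytes) -> bytes:
    """Send one datagram to a local UDP server and return the server's reply."""
    # Server side: socket() then bind(). Port 0 asks the OS for any free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)
    server_addr = server.getsockname()

    # Client side: socket() then sendto(). There is no connect() step.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    client.sendto(message, server_addr)

    # Server: recvfrom() returns one whole datagram plus the sender's address.
    data, client_addr = server.recvfrom(4096)
    # Server replies with sendto() to the address it just learned.
    server.sendto(data.upper(), client_addr)

    # Client: recvfrom() picks up the reply, then both sides close().
    reply, _ = client.recvfrom(4096)
    client.close()
    server.close()
    return reply
```

Calling udp_echo_once(b"ping") should hand back the uppercased datagram. On a real network the datagram could be lost, which is why the sketch sets timeouts instead of waiting forever.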
For TCP, the client calls socket(), connect(), send() and/or recv(), and close(). On the server, TCP calls include socket(), optionally setsockopt(), bind(), listen(), accept(), send() and/or recv(), and close().
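A minimal sketch of the TCP call sequence, again in Python and again with both ends in one process. The name tcp_roundtrip and the tiny chunk_size are assumptions made for the demo. The server deliberately reads only a few bytes at a time to show that TCP is a byte stream and the reader picks the chunk size.

```python
import socket
import threading


def tcp_roundtrip(message: bytes, chunk_size: int = 4) -> bytes:
    """Send message over a loopback TCP connection; return what the server read."""
    # Server: socket(), setsockopt(), bind(), listen().
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    addr = listener.getsockname()

    received = []

    def serve() -> None:
        # accept() returns a new socket that is connected to the client.
        conn, _peer = listener.accept()
        with conn:
            while True:
                chunk = conn.recv(chunk_size)  # read in small pieces
                if not chunk:  # empty bytes means the client closed
                    break
                received.append(chunk)

    worker = threading.Thread(target=serve)
    worker.start()

    # Client: socket(), connect(), send (sendall here), close().
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(addr)
    client.sendall(message)
    client.close()

    worker.join(timeout=5)
    listener.close()
    return b"".join(received)
```

The server may see the message split into several chunks, but joined together the bytes come out complete and in order.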
The socket() call returns a file descriptor. The bind() call associates the socket with an IP address and port number. The listen() call marks the socket as one that accepts incoming connections and sets how many pending connections can be queued. It does not wait by itself. Then accept() blocks until a client has connected and takes the first connection off the queue, creating a new socket that is connected on both ends. To deal with multiple connections at the same time, you should have a multithreaded server.
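One common way to handle several connections at once is a thread per connection. The sketch below is only one possible layout. The names handle, serve, recv_exact and demo are made up for this example, and the server simply echoes back whatever it receives.

```python
import socket
import threading


def handle(conn: socket.socket) -> None:
    """Echo everything back until the client closes its side."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


def serve(listener: socket.socket, max_clients: int) -> None:
    """Accept max_clients connections and give each one its own thread."""
    workers = []
    for _ in range(max_clients):
        conn, _addr = listener.accept()  # blocks until a client is queued
        worker = threading.Thread(target=handle, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Keep calling recv() until n bytes have arrived or the peer closes."""
    buf = b""
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            break
        buf += part
    return buf


def demo(n_clients: int = 3) -> list:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(n_clients)
    addr = listener.getsockname()
    server = threading.Thread(target=serve, args=(listener, n_clients))
    server.start()

    # Open every client first so the connections overlap in time.
    clients = [socket.create_connection(addr) for _ in range(n_clients)]
    replies = []
    for i, client in enumerate(clients):
        msg = f"client {i}".encode()
        client.sendall(msg)
        replies.append(recv_exact(client, len(msg)))
    for client in clients:
        client.close()
    server.join()
    listener.close()
    return replies
```

Running demo(3) should return each client's own message echoed back. A thread per connection is easy to follow but does not scale to huge numbers of clients, so treat this as a starting point.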
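For data types larger than 8 bits, you need to worry about the order in which the bytes are stored and transmitted. There are two conventions for this byte order, Big Endian and Little Endian. Big Endian means the higher order bytes come first. Little Endian means the lower order bytes come first. Network protocols conventionally send multi-byte integers in Big Endian order, which is also called network byte order. You have to worry about the order on the machines at both ends, as well as the order of the bytes in transit.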
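To make the byte order concrete, here is a small sketch using Python's struct module. The value 0x12345678 and the helper names encode_port and decode_port are just picked for illustration. In struct format strings, ">" means Big Endian, "<" means Little Endian, and "!" means network byte order.

```python
import socket
import struct

value = 0x12345678

# Same 32-bit integer, packed with each byte order.
big = struct.pack(">I", value)      # highest order byte first
little = struct.pack("<I", value)   # lowest order byte first
network = struct.pack("!I", value)  # network order, same bytes as big


def encode_port(port: int) -> bytes:
    """Pack a 16-bit port number in network byte order for sending."""
    return struct.pack("!H", port)


def decode_port(data: bytes) -> int:
    """Unpack a 16-bit port number that arrived in network byte order."""
    return struct.unpack("!H", data)[0]


# htons() and ntohs() convert between host order and network order.
# Whatever the host order is, a round trip gives the original value back.
round_trip = socket.ntohs(socket.htons(8080))
```

If both ends always pack and unpack with "!", neither side has to care what byte order the other machine uses internally.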
Work Smarter not Harder
-
We have large data sets in my current project. Every year tons more data is
loaded into the system. So we only keep the majority of data for 4 years.
After...