# User Documentation (Getting Started)

## Installation

From PyPI

```bash
pip install space_packet_parser
```

From Anaconda

```bash
conda install -c lasp space_packet_parser
```

## Basic Usage

The typical workflow for parsing packets is to

1. Load a packet definition

   Packet definitions are the XTCE configuration documents that describe how to parse and extract binary chunks of data into Python variables.

   ```python
   import space_packet_parser as spp

   definition = spp.load_xtce("/path/to/xtce_definition.xml")
   ```

2. Iterate over binary data

   You can load binary data from a file all at once, or continually read from a socket stream. To parse individual packets, you can iterate over that binary data to yield individual binary packet chunks one at a time. There is a built-in generator for CCSDS packets. Other binary packet generators can be used as well if your packets follow a different protocol from CCSDS.

   ```python
   for binary_packet in spp.ccsds_generator("/path/to/packet_file.ccsds"):
       # Print out each packet's header
       print(binary_packet)
   ```

3. Parse the binary packet data into a dictionary of parsed items

   With a definition (1) and a stream of individual packets (2), one can then parse the contents of that binary data into Python objects. The packet definition defines a lookup structure based on `Parameter` names, which are returned as a Python dictionary of `{ParameterName: value}` items.
   ```python
   packet = definition.parse_bytes(binary_packet)
   # All items within the packet
   print(packet)
   # An individual item
   print(packet["my_uint3_param"])
   ```

Putting this all together in an example script:

```python
from pathlib import Path
import space_packet_parser as spp
from space_packet_parser import ccsds

packet_file = Path('my_packets.pkts')
xtce_document = Path('my_xtce_document.xml')

# 1) load the XTCE
packet_definition = spp.load_xtce(xtce_document)
# 2) create a binary generator to yield packet binary data
generator = spp.ccsds_generator(packet_file.open('rb'))
# 3) parse individual packets with the definition
packets = [packet_definition.parse_bytes(ccsds_packet) for ccsds_packet in generator]

# You can introspect the packet definition to learn about what was parsed
# Look up a type (includes unit and encoding info)
pt = packet_definition.parameter_types["MY_PARAM_Type"]
# Look up a parameter (includes short and long descriptions)
p = packet_definition.parameters['MY_PARAM']
# Look up a sequence container (includes inheritance)
sc = packet_definition.containers['SecondaryHeaderContainer']
# See the API docs for more information about the ParameterType, Parameter, and SequenceContainer classes

with packet_file.open("rb") as binary_data:
    ccsds_generator = ccsds.ccsds_generator(binary_data)
    for packet_bytes in ccsds_generator:
        packet = packet_definition.parse_bytes(packet_bytes)
        # Do something with the packet data, which behaves like a dict
        print(packet['PKT_APID'])
        print(packet.header)  # subset of packet
        print(packet.user_data)  # subset of packet
```

We aim to provide examples of usage patterns. Please see the `examples` directory in the GitHub repo. If there is a specific example you want to see demonstrated, please open a GitHub Issue or Discussion for support.

## Parsing Packets to Xarray Datasets

For analysis and visualization workflows, Space Packet Parser can parse packets directly into Xarray Datasets using the `create_dataset` function.
This is particularly useful when working with timeseries telemetry data. Note that this requires installing the optional `xarray` dependencies (the quotes keep shells such as zsh from interpreting the square brackets):

```bash
pip install "space_packet_parser[xarray]"
```

The `create_dataset` function returns a dictionary of Datasets keyed by APID, where each Dataset contains all parameters from packets with that APID:

```python
from pathlib import Path
import space_packet_parser as spp
from space_packet_parser.xarr import create_dataset

packet_file = Path('my_packets.pkts')
xtce_definition_file = Path('my_xtce_document.xml')

# Parse packets directly to Xarray Datasets (one per APID)
datasets = create_dataset(
    packet_files=[packet_file],
    xtce_packet_definition=xtce_definition_file
)

# Access dataset for a specific APID
apid_1_data = datasets[1]
print(apid_1_data)

# Work with the data
print(apid_1_data['MY_PARAMETER'].values)
```

You can filter packets by APID or other criteria by passing a `packet_filter` function. This is useful when working with multiplexed packet streams:

```python
# Filter to only parse packets with APID 41
datasets = create_dataset(
    packet_files=[packet_file],
    xtce_packet_definition=xtce_definition_file,
    packet_filter=lambda pkt: pkt.apid == 41
)
```

**Limitations**: The `create_dataset` function only supports packet definitions with consistent field structure across all packets with the same APID. It cannot handle polymorphic packets where the structure changes based on previously parsed values. For such cases, use the low-level parsing API by calling the `parse_bytes()` method directly.

## Packet Bytes Generators

Packet bytes generators are functions that yield individual packets as `bytes` objects (or subclasses of `bytes`) from a binary data source. Space Packet Parser provides built-in generators like `ccsds_generator`, `fixed_length_generator`, and `udp_generator`, but users can write custom generators to parse any packet format they need.
A generator function should accept a binary data source (file-like object, socket, or bytes) and yield packet bytes one at a time. The built-in generator implementations in `space_packet_parser/generators/` provide complete examples of how to implement packet bytes generators.

Custom generators allow you to adapt Space Packet Parser to work with any binary packet format. While XTCE is commonly used with CCSDS packets, the XTCE standard is not limited to representing CCSDS packet structures. The CCSDS header information (VERSION, TYPE, APID, etc.) is not required by XTCE. You can define XTCE packet structures for any binary format and use a custom or built-in generator to yield those packets for parsing.

### Built-in Generators

#### CCSDS Generator

The `ccsds_generator` parses CCSDS Space Packets according to the CCSDS standard. It uses the packet length field in the CCSDS header to determine packet boundaries and supports features like segmented packet reassembly.

```python
from space_packet_parser import ccsds_generator, load_xtce

packet_definition = load_xtce("my_ccsds_packets.xml")

for packet_bytes in ccsds_generator(binary_data):
    parsed = packet_definition.parse_bytes(packet_bytes)
    print(parsed)
```

#### Fixed Length Generator

The `fixed_length_generator` yields fixed-size chunks from binary data. This is useful for packet formats where all packets have a known, constant length.

```python
from space_packet_parser import load_xtce
from space_packet_parser.generators import fixed_length_generator

packet_definition = load_xtce("my_fixed_length_packets.xml")

for packet_bytes in fixed_length_generator(binary_data, packet_length_bytes=64):
    parsed = packet_definition.parse_bytes(packet_bytes)
    print(parsed)
```

#### UDP Generator

The `udp_generator` parses UDP (User Datagram Protocol) packets from binary data. It reads the UDP length field from each packet header to determine packet boundaries.
The generator yields `UDPPacketBytes` objects that expose UDP header fields (source port, destination port, length, checksum) as properties.

```python
from space_packet_parser import udp_generator, load_xtce

packet_definition = load_xtce("my_udp_packets.xml")

for udp_packet in udp_generator(binary_data):
    # Access UDP header fields directly
    print(f"From port {udp_packet.source_port} to port {udp_packet.dest_port}")

    # Parse the packet using XTCE
    parsed = packet_definition.parse_bytes(udp_packet)
    print(parsed)
```

### Writing Custom Generators

A minimal custom generator follows this pattern:

```python
def custom_generator(binary_data, packet_length):
    """Yields fixed-length packets of `packet_length` bytes from binary data."""
    while True:
        # Read the next chunk; an empty result means the source is exhausted
        packet_bytes = binary_data.read(packet_length)
        if not packet_bytes:
            break
        yield packet_bytes
```

For more sophisticated generators that handle multiple input types (files, sockets, bytes) and provide progress tracking, see the implementations of the built-in generators in `space_packet_parser/generators/`. These demonstrate best practices like using the `_setup_binary_reader` utility for handling different data sources and optional progress bars.

### Filtering Packets

For generators that expose packet metadata (like `CCSDSPacketBytes` with its `apid` property), you can filter packets before parsing to improve performance. A code example of this is linked in [examples](examples.md).

## Error Handling and Debugging

When parsing packets, you may encounter situations where packets cannot be parsed successfully. The low-level API provides direct control over how to handle these cases.

### Handling UnrecognizedPacketTypeError

If a packet doesn't match any of the defined packet structures in your XTCE definition, an `UnrecognizedPacketTypeError` will be raised.
You can catch this error to examine the partially parsed packet data for debugging:

```python
from space_packet_parser import ccsds
from space_packet_parser.exceptions import UnrecognizedPacketTypeError

with packet_file.open("rb") as binary_data:
    ccsds_generator = ccsds.ccsds_generator(binary_data)
    for packet_bytes in ccsds_generator:
        try:
            packet = packet_definition.parse_bytes(packet_bytes)
            # Process successful packet
            print(f"Successfully parsed packet with APID: {packet.binary_data.apid}")
        except UnrecognizedPacketTypeError as e:
            # Handle unrecognized packet
            print("Unrecognized packet type")
            print(f"Partial data: {e.partial_data}")  # Contains any successfully parsed fields
            # Continue processing other packets or handle the error as needed
```

## Packet Objects

The object returned from `parse_bytes()` is a `SpacePacket`. This object subclasses a Python dictionary and behaves as a dictionary. To retrieve a parameter value from the parsed packet, you can iterate over its `items()` or you can access individual parameters by name.

```python
from space_packet_parser import ccsds

ccsds_generator = ccsds.ccsds_generator(data)
packet_bytes = next(ccsds_generator)
packet = packet_definition.parse_bytes(packet_bytes)
my_param = packet["MY_PARAM_NAME"]
all_param_names = list(packet.keys())
```

## Parameter Objects

The parameter values within the packet are subclasses of normal Python data types: `int`, `float`, `str`, `bool` and `bytes`. The objects behave exactly as the Python data types except that they all contain a `raw_value` attribute, which contains the value generated by the data encoding parser, before being passed through any calibrators, enum lookups, string parsing, or boolean evaluation.
```python
print(my_param)  # prints the most derived value available - str, int, float, bytes, or bool
print(my_param.raw_value)  # prints the "raw" encoded value parsed by the low level data encoding
```

Space Packet Parser returns the following types for parameters within a packet. They behave just as their Python base classes (`int`, `float`, `str`, `bytes`, and `bool` respectively) except that each contains a `raw_value` attribute that contains the encoded value before applying any calibration or other derived processing of the value. The primary value of each parameter type is the fully parsed (calibrated, enumerated, string-parsed, etc.) value.

- `IntParameter`
- `FloatParameter`
- `StrParameter`
- `BinaryParameter`
- `BoolParameter`

### Numeric Calibration

Int and float parameters can be calibrated on the fly during decoding. These calibrators are defined on the data encoding XTCE element and can transform the raw encoded value to a calibrated value, e.g. via a polynomial. Calibrated values are always floats, even if the raw encoded value is an integer.

For example,

```xml
```

in this encoding definition, the raw encoded value is a 16-bit unsigned integer that is calibrated by a polynomial to produce a calibrated value, which is always a float. In this case `value = .012155 * raw_value + 2.54`.

### String Parsing

Strings are encoded as a buffer of determined size (either fixed length or dynamic based on a previous parameter). The raw buffer includes any additional string data such as a leading size integer or a termination character. If a leading size or termination character is specified in the XTCE definition, the parsed string value is returned as the value of the parameter and the buffer is returned as the `raw_value`. If no termination character or leading size is specified, the value and `raw_value` are the same and both refer to the raw string buffer.
For example,

```xml
```

in this encoding definition, the size of the raw string buffer (number of bytes in the packet) is defined by a parameter named `STR_SIZE`. The value stored in `STR_SIZE` is given in number of bytes, so it is multiplied by 8 and a constant base length of 27 bits is added to the final buffer size. So if `STR_SIZE` encodes the value 4, the raw string buffer width in the packet is 8 × 4 + 27 = 59 bits. This is an odd size for a string because it is not an integer number of bytes, but that's because it includes a 3-bit unsigned int in front of the string data that specifies the size of the string, in bits, making the raw string `[3 bit uint | 7 bytes]`.

In this case, the `raw_value` of the parameter will contain the full string buffer as an 8 byte string, padded on the RHS with 5 zero bits. We have to pad it because you cannot create a byte string from a non-integer number of bytes (59 bits). The `value` of the parameter will contain the fully parsed `str` object based on the value of the leading size. If the leading size uint3 represents the integer 4, the `value` of the parameter will be a `str` that is made of the first 4 bytes of data in the raw buffer following the leading size.

Termination characters work similarly.

```xml
32 0058
```

In this case, the raw buffer is a fixed length (32 bits). The parsed `StrParameter.raw_value` will be the full string buffer, including the termination character and any additional following bytes. The `value` of the parameter will be a `str` based on all the encoded bytes preceding the termination character. In this case, the raw string buffer _will_ always be an integer number of bytes since a termination character is always an integer number of bytes, so no padding of the raw value is required.

### Enumerated Lookups

Enums are defined by lookup tables in the XTCE, which are converted to dictionaries internally.
Once the raw value from the data encoding is parsed, a lookup is made to the lookup table and the final string label is returned. Note that the final label from enumerated lookups is always a string. The raw value used in the lookup table is interpreted based on the data encoding for the parameter. Integer encoded enum values are ints, float encoded values are floats, and string encoded values are used as the raw string buffer from the encoding.

Only raw values may be used for enum lookups. Calibrated numeric values cannot be used for enum lookups from numeric encodings. For string encoded parameters, only raw string buffers may be used for lookups (not fully parsed strings).

For example,

```xml
```

the encoded value (`raw_value`) is a uint8 integer but the value returned for an enumerated parameter type will be a `StrParameter` containing the label string associated with the integer value.

### Boolean Evaluation

Booleans behave nicely for integers and floats where zero is False and everything else is True. For string and binary encoded values, the only falsy value is an empty string, which is kind of silly to encode. XTCE is not specific on the interpretation of string and binary encoded values for boolean parameters and there is no generally accepted interpretation, so we default to Python's `bool`, which interprets any non-empty string as True.

Only raw values may be used for boolean evaluation. Calibrated values are not considered.

For example,

```xml
m/s
```

the encoded value (`raw_value`) is a single bit interpreted as an integer but the value returned for a boolean parameter type will be a `BoolParameter`, evaluated over the encoded integer value. `False` if the integer is 0, `True` otherwise.

## Parsing from a Socket

The input data object to `XtcePacketDefinition.packet_generator` need only be a binary filelike object from which bytes can be read. This means the packet generator is not limited to parsing data from files!
In an effort to support development of quicklook type tools, we provide an example of parsing data streaming through a socket in `parsing_and_plotting_idex_waveforms_from_socket.py`. The example mocks the behavior of an instrument sending packet data asynchronously through a socket in chunks of inconsistent size. The packet parser reads bytes from the receiver side of the socket and will read data repeatedly until there is sufficient data for the full packet. Once it has a full packet (as determined by the packet length in the CCSDS header), it cranks the generator and yields a parsed packet.

You'll notice that the example ends with a timeout error. This timeout can be controlled when creating the socket connection with `receiver.settimeout(timeout_seconds)`.

## Variable Length Packet Fields of Explicit Length

Flight software engineers often need to downlink data (usually binary blobs) of variable length. The length of these fields is often specified in a _previous_ telemetry point in the same packet, and you have to fetch the length by referencing that previous field.

### Explicit Variable Length Example

Suppose the variable length field is called `SCI_DATA` and is a binary blob (e.g. of compressed data). The length of this field is specified earlier in the packet in a field called `SCI_DATA_BYTELEN`, specified in number of bytes. To define the type for `SCI_DATA` in XTCE, you could use the following (snippet):

```xml
```

This tells the parser that the size in bits of data type `SCI_DATA_Type` (the type of `SCI_DATA`) is the raw value encoded in the parameter `SCI_DATA_BYTELEN`, multiplied by 8 (to convert number of bytes to number of bits).

## Variable Length Packet Fields of Implicit Length

In some circumstances, flight software teams define a packet field that simply fills up the "remaining space" in the packet.
The length of this field is usually implicit but can be computed by subtracting the combined length of all fixed length fields in the packet from the total length of the packet specified in the CCSDS header.

The `PKT_LEN` field is the length of the packet user data, in bytes. This field:

- counts from zero
- does not include the header data (always 6 bytes)

Thus, you can determine the length of your field dynamically from the packet length in the CCSDS header:

$$len_{var} = 8 \times (len_{packet} + 1) - \sum_n len_{fixed,n}$$

where

- $len_{var}$ is the length, in bits, of the variable length field
- $len_{packet}$ is the packet user data length in bytes (from the CCSDS header)
- $\sum_n len_{fixed,n}$ is the combined length, in bits, of all fixed length fields in the packet user data

There are some limitations to this. If your FSW team is violating these limitations, they are making your life extremely difficult, and you have my condolences.

- You can only have a _single_ "remaining packet length" field in a given packet definition. Encoding more than one such field makes it impossible to determine the length of the fields.
- All other fields in the packet _must_ be fixed length. There is no way that I know of in XTCE to calculate a dynamic length that is an arbitrary function of multiple previous length specifier fields.
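The formula above can be sketched in plain Python as follows. This is only an illustration of the arithmetic and is not part of the Space Packet Parser API; the function name `remaining_field_bits` and its arguments are hypothetical.

```python
def remaining_field_bits(pkt_len: int, fixed_field_bits: list[int]) -> int:
    """Return the length, in bits, of a single "remaining space" field.

    pkt_len: raw PKT_LEN value from the CCSDS header (user data bytes, counted from zero)
    fixed_field_bits: bit lengths of every fixed length field in the packet user data
    """
    # PKT_LEN counts from zero, so add one before converting bytes to bits
    user_data_bits = 8 * (pkt_len + 1)
    remaining = user_data_bits - sum(fixed_field_bits)
    if remaining < 0:
        raise ValueError("Fixed length fields exceed the packet user data length")
    return remaining


# Hypothetical packet: PKT_LEN = 19 (20 bytes of user data)
# with fixed length fields of 32, 8, and 16 bits
print(remaining_field_bits(19, [32, 8, 16]))  # 8 * 20 - 56 = 104
```

A negative result means the fixed length fields do not fit in the declared packet length, which usually points to a mistake in the packet definition, so the sketch raises an error in that case.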
### Implicit Variable Length Example

Packet Definition:

```text
"VERSION" : 3 bits
"TYPE" : 1 bits
"SEC_HDR_FLG" : 1 bits
"PKT_APID" : 11 bits
"SEQ_FLGS" : 2 bits
"SRC_SEQ_CTR" : 14 bits
"PKT_LEN" : 16 bits
"SHCOARSE" : 32 bits
"SID" : 8 bits
"SPIN" : 8 bits
"ABORTFLAG" : 1 bits
"STARTDELAY" : 15 bits
"COUNT" : 8 bits
"EVENTDATA": variable length
```

To calculate the length of `EVENTDATA`:

```{math}
len_{var} &= 8 \times (len_{packet} + 1) - (&&len_{SHCOARSE} + len_{SID} + len_{SPIN} + \\
& &&len_{ABORTFLAG} + len_{STARTDELAY} + len_{COUNT})\\
&= 8 \times (len_{packet} + 1) - (&&32 + 8 + 8 + 1 + 15 + 8)\\
&= 8 \times len_{packet} - 64 &&
```

This equation can be implemented in XTCE by referencing the packet length field as follows:

```xml
```

## XTCE Document Validation

Space Packet Parser provides comprehensive validation capabilities for XTCE documents to help ensure they are correct and will work properly for parsing packets. The validation system operates in three modes: "schema", "structure", and a default mode of "all" (both schema and structure validation).

- **Schema Validation**: Validates the XML document against the in-document referenced XTCE XSD schema
- **Structural Validation**: Validates XTCE-specific structure and reference integrity

Schema validation requires correct namespacing declarations at the top of your XTCE document, e.g.
```xml
```

### CLI Validation

```shell
spp --log-level=DEBUG validate my_xtce.xml --local-schema my_xsd.xml --level all
```

### Programmatic Validation

```python
from space_packet_parser import validate_xtce

# Validate an XTCE file against the referenced schema
result = validate_xtce("my_xtce.xml", level="schema")
if result.errors:
    for error in result.errors:
        print(f"Error: {error}")
else:
    print("Document is valid")

# Validate an XTCE document structure to check for
# unused Parameters, ParameterTypes, and nonexistent references
result = validate_xtce("my_xtce.xml", level="structure")
if result.errors:
    for error in result.errors:
        print(f"Error: {error}")
else:
    print("Document is valid")

# Comprehensive validation (both schema and structure)
result = validate_xtce("my_xtce.xml", level="all")
print(f"Validation completed in {result.validation_time_ms:.1f}ms")
if result.errors:
    for error in result.errors:
        print(f"Error: {error}")
else:
    print("Document is valid")
```

## Troubleshooting Packet Parsing

Parsing binary packets is error-prone and getting the XTCE definition correct can be a challenge at first. Most flight software teams can export XTCE from their command and telemetry database but these exports usually require some fine-tuning.

`UnrecognizedPacketTypeError`s are raised during parsing of an individual packet when either:

- a) multiple child containers are valid inheritors of the current sequence container based on restriction criteria evaluated against the data parsed so far
- b) no child containers are valid inheritors of the current sequence container based on restriction criteria evaluated against the data parsed so far and the current container is abstract

To aid you during development, `UnrecognizedPacketTypeError` exceptions generated during parsing can be returned alongside any valid packet objects by setting `yield_unrecognized_packet_errors=True`. These exception objects are not raised so that the generator may keep parsing.
Instead, they are yielded from the generator with a `partial_data` attribute for user examination. This partial data allows you to see how far the parser got through a container inheritance structure before failing to determine the rest of the structure.

## Common Issues and Solutions

### Parser Generator Completes without Yielding a Packet

This can occur if your data file contains only packets that do not match any packet definitions in your XTCE document and `yield_unrecognized_packet_errors=False` (the default). This could mean that your data file actually contains only APIDs that are not covered in your packet definition, but usually it means you have incorrectly defined restriction criteria for SequenceContainer inheritance. For example, a restriction criteria element that requires an APID which does not exist in the data.

```xml
```

### Only Packet Headers are Parsed

If you observe that only packet headers are being parsed but no exceptions are being raised (you may be seeing a lot of length mismatch warnings if you have logging set up), it likely means that you have forgotten to set `abstract="true"` on your non-concrete sequence container elements. For example

```xml
Super-container for telemetry and command packets
```

will parse as a complete packet, containing only `VERSION` and `TYPE` instead of searching for inheriting sequence containers. To define the container as abstract, change the first element opening tag to

```xml
...contents
```

## Optimizing for Performance

The logic evaluated during packet parsing is largely reflective of the XTCE configuration being used to define packet structures. The more logic in the XTCE, the more logic must be evaluated during parsing. Below are some common ways to reduce complexity and speed up parsing:

1. **Remove `RestrictionCriteria` Elements:** If your packet stream is a single packet structure, there is no reason to require the evaluation of a restriction criteria for each packet.
2. **Remove Unnecessary Packet Definitions:** Even in a packet stream with multiple packet formats, if you only care about one packet type, you can remove the definitions for the others. By default, the packet `Parser` will catch unrecognized packet errors and skip to the next packet. This skips the parsing of packets for which a valid definition cannot be determined.
3. **Reduce Container Inheritance:** A flat container definition structure will evaluate restriction criteria faster than a nested structure. Each instance of nesting requires an additional `MatchCriteria.evaluate()` call for each packet being parsed.
4. **Reduce Complex Items:** Parameter type definitions that contain calibrators or complex string parsing (especially variable length termination character defined strings) add significant evaluation logic to the parsing of each parameter, as does any parameter type that is variable length. Removing them can speed up parsing.
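
To check whether one of these changes actually helps, it can be useful to time parsing before and after editing the XTCE. The sketch below is a generic timing helper and is not part of Space Packet Parser; `packets_per_second` and `fake_parse` are hypothetical names used only for illustration. In practice you would presumably pass something like `packet_definition.parse_bytes` together with a list of packet bytes collected from a generator.

```python
import time
from typing import Callable, Iterable


def packets_per_second(parse: Callable[[bytes], object], packets: Iterable[bytes]) -> float:
    """Time `parse` over `packets` and return the observed throughput."""
    count = 0
    start = time.perf_counter()
    for packet_bytes in packets:
        parse(packet_bytes)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def fake_parse(packet_bytes: bytes) -> dict:
    """Stand-in for a real parse call; reads only the 11-bit APID from the first two bytes."""
    return {"PKT_APID": int.from_bytes(packet_bytes[:2], "big") & 0x07FF}


# Hypothetical data: 10,000 copies of a 7-byte packet with APID 41
sample_packets = [bytes([0x08, 0x29, 0xC0, 0x00, 0x00, 0x00, 0x00])] * 10_000
rate = packets_per_second(fake_parse, sample_packets)
print(f"{rate:.0f} packets/s")
```

Running the same helper against the same packet list with two versions of a definition gives a rough comparison; timings vary between runs, so repeat the measurement a few times before drawing conclusions.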