Persistent TCP Data Source
The Persistent TCP module provides high-speed, reliable data synchronization between Mango instances. It includes both a data source (receiver) and a publisher (sender) that work together to transfer point values, data point configurations, and events from one Mango instance to another. This is the primary mechanism for building distributed Mango architectures where edge instances collect data and publish it to a central instance.
The module includes two generations of the technology:
- Legacy Persistent TCP (PTCP): The original implementation using a custom binary protocol over raw TCP.
- gRPC (recommended): The modern replacement that uses the gRPC protocol over TCP with TLS encryption, persistent queuing, and enhanced point synchronization.
For new installations, the gRPC data source and publisher are recommended over the legacy PTCP versions.
Overview
| Property | Value |
|---|---|
| Module | mangoAutomation-PersistentTcp |
| Protocol | Persistent TCP / gRPC |
| Direction | Listening |
| Typical Use | Receiving data from remote Mango instances (edge-to-cloud) |
Prerequisites
- Two or more Mango instances with the Persistent TCP module installed.
- Network connectivity between instances on the configured TCP port.
- For gRPC: the gRPC server must be enabled in mango.properties (it is enabled by default).
- For gRPC with TLS: proper TLS configuration in mango.properties on both the client and server sides.
- For legacy PTCP: a shared authorization key configured identically on both the publisher and data source.
- Sufficient disk space for the gRPC persistent queue (see queue size estimation below).
Architecture
The Persistent TCP system follows a publisher-subscriber model:
- Publisher (client): Runs on the edge/source Mango instance. Collects point values and sends them to the remote data source. Also synchronizes data point configurations.
- Data source (server): Runs on the central/receiving Mango instance. Listens for connections from publishers and receives point values, creating and updating data points as needed.
Typically, multiple edge Mango instances each run a publisher that connects to a single central Mango instance running a Persistent TCP or gRPC data source. Other topologies are also possible.
gRPC Data Source Configuration (Recommended)
Data Source Settings
| Setting | Description |
|---|---|
| Name | A descriptive name for the data source. |
| Default data source | Makes this the default gRPC data source. Publishers not explicitly linked to another data source are routed here. Only one data source may be set as default. |
| Synchronous writes | When enabled, point values are written to the time-series database synchronously upon receipt. When disabled, values are queued in memory and written asynchronously for higher throughput (at the expense of consistency). |
| Linked publishers | Select one or more publishers to route to this data source. Each publisher may only be linked to one data source. |
Permission Synchronization
The gRPC data source controls how data point permissions are synchronized from the publisher:
| Mode | Behavior |
|---|---|
| Ignore | Permissions from the publisher are ignored. Manual modifications on the data source side are preserved. |
| Copy | Permissions match those sent by the publisher. |
| Merge (OR) | Permissions from the publisher are merged with local overrides using a logical OR operation. |
| Merge (AND) | Permissions are merged using a logical AND (cartesian product of role terms). |
| Replace | Permissions are replaced with the configured overrides. |
You can also configure read, edit, and set permission overrides, as well as extra tags that are applied to all synchronized data points. Changing any of these settings triggers a full resynchronization of all data points.
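The difference between the modes is easiest to see with a small model. The sketch below assumes a permission is a set of terms, where each term is a set of roles that must all be held and any one satisfied term grants access. This model and the names `perm`, `merge_or`, `merge_and`, and `resolve` are illustrative assumptions, not Mango's API.

```python
from itertools import product

def perm(*terms):
    """Build a permission from terms; each term is a collection of role names."""
    return frozenset(frozenset(term) for term in terms)

def merge_or(published, overrides):
    # OR: access is granted if either side grants it, so keep every term.
    return published | overrides

def merge_and(published, overrides):
    # AND: both sides must grant access. Pair each published term with each
    # override term (cartesian product) and require the roles of both.
    return frozenset(a | b for a, b in product(published, overrides))

def resolve(mode, published, overrides, current):
    """Return the permission a synchronized point ends up with (hypothetical)."""
    if mode == "ignore":
        return current      # local value is left untouched
    if mode == "copy":
        return published    # match the publisher
    if mode == "merge_or":
        return merge_or(published, overrides)
    if mode == "merge_and":
        return merge_and(published, overrides)
    if mode == "replace":
        return overrides    # configured overrides win
    raise ValueError(f"unknown mode: {mode}")
```

In this model, if the publisher sends "operator OR engineer" and the override is "site-a", Merge (OR) yields three alternatives, while Merge (AND) yields "(operator AND site-a) OR (engineer AND site-a)".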
gRPC Publisher Settings
The publisher runs on the source/edge Mango instance.
Connection Settings
| Setting | Description |
|---|---|
| Use TLS | Enables TLS encryption for the connection. Highly recommended. The gRPC server must have matching TLS configuration. |
| Host | The DNS name or IP address of the remote Mango instance running the gRPC data source. |
| Port | The TCP port of the remote gRPC server. |
| Authentication token | A Bearer token sent in the Authorization header. Use a Mango JWT (generated on the user page) or an OAuth 2.0/OIDC token. |
Data Point Settings
| Setting | Description |
|---|---|
| XID prefix | A string prepended to each data point's XID when synchronized. Ensures unique XIDs when multiple publishers with similar point configurations publish to the same central instance. Maximum XID length is 100 characters. |
| Permission sync mode | Controls whether and how permissions are sent to the data source (Don't send, Send, Merge OR, Merge AND, Replace). |
| Extra tags | Tags added to every published data point (e.g., location=site-a). Useful for identifying the source of published data. |
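Because the prefix counts toward the 100-character XID limit, it can help to check candidate prefixes against existing XIDs before publishing. The following is a minimal sketch; the helper name `prefixed_xid` and the sample XIDs are hypothetical.

```python
MAX_XID_LENGTH = 100  # limit stated in the table above

def prefixed_xid(prefix: str, xid: str) -> str:
    """Return prefix + xid, or raise if the result exceeds the XID limit."""
    combined = prefix + xid
    if len(combined) > MAX_XID_LENGTH:
        raise ValueError(
            f"XID '{combined[:20]}...' is {len(combined)} characters; "
            f"the maximum is {MAX_XID_LENGTH}"
        )
    return combined
```

For example, `prefixed_xid("site-a-", "DP_temp")` returns `"site-a-DP_temp"`, while a 95-character XID combined with the same 7-character prefix raises an error.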
Publishing Settings
| Setting | Description |
|---|---|
| Send events | Enables sending data point events to the remote data source. |
| Send attributes | Enables sending data point attribute updates (e.g., the "unreliable" flag). |
| Queue when disabled | Continues collecting data into the persistent queue even while the publisher is disabled, preventing data gaps. |
| Max entries per request | Maximum number of queue entries per gRPC request. Too high may cause RESOURCE_EXHAUSTED errors; too low reduces throughput. |
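The trade-off behind "Max entries per request" can be pictured as splitting the queue into bounded batches: larger batches mean fewer requests but larger messages. This is only an illustrative sketch of batching in general, not the publisher's actual implementation; `batches` is a hypothetical helper.

```python
from itertools import islice

def batches(entries, max_entries_per_request):
    """Yield lists of at most max_entries_per_request items from entries."""
    if max_entries_per_request < 1:
        raise ValueError("max_entries_per_request must be at least 1")
    iterator = iter(entries)
    while True:
        batch = list(islice(iterator, max_entries_per_request))
        if not batch:
            return
        yield batch
```

With five queued entries and a limit of two, this produces three requests carrying two, two, and one entries.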
Persistent Queue
The gRPC publisher uses an on-disk persistent queue instead of an in-memory queue. This means queue sizes can be much larger (tens to hundreds of millions of entries) without impacting memory. The primary constraint is available disk space.
| Setting | Description |
|---|---|
| Maximum queue size | Maximum number of entries the queue can hold before the oldest entries are discarded. |
| Warning queue size | Queue size at which a warning event is raised. Set to 50-80% of the maximum queue size. |
Always-on data collection: As of Mango 5.6.0, the gRPC publisher continuously collects point values and events into the persistent queue, even when the publisher is disabled. To stop data collection entirely, delete the publisher. Alternatively, set the maximum queue size to a low value to limit accumulation.
Estimating Queue Size
- Estimate the number of point values published per second per point (based on poll rate and publish type).
- Multiply by the number of data points.
- Add estimates for events and attribute updates if those features are enabled.
- Multiply by the time range you need the queue to cover (e.g., 3 days of offline tolerance).
- Use this figure as the maximum queue size.
- Estimate disk usage at approximately 40 bytes per entry for numeric values (alphanumeric values use more).
The publisher's status tab shows the current queue size, queue file size, and ingress rate for monitoring.
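The estimation steps above can be sketched as a small calculation. It assumes one published value per point per poll and uses the approximate 40 bytes per numeric entry quoted above; the function name and parameters are hypothetical, and real usage depends on publish type and value types.

```python
import math

BYTES_PER_NUMERIC_ENTRY = 40  # approximate figure from the steps above
SECONDS_PER_DAY = 86_400

def estimate_queue(points, poll_interval_seconds, coverage_days,
                   extra_entries_per_second=0.0):
    """Return (max_queue_entries, approx_disk_bytes) for the given workload.

    extra_entries_per_second covers events and attribute updates, if enabled.
    """
    entries_per_second = points / poll_interval_seconds + extra_entries_per_second
    max_entries = math.ceil(entries_per_second * coverage_days * SECONDS_PER_DAY)
    return max_entries, max_entries * BYTES_PER_NUMERIC_ENTRY

if __name__ == "__main__":
    entries, disk_bytes = estimate_queue(points=5000, poll_interval_seconds=10,
                                         coverage_days=3)
    print(f"{entries:,} entries, about {disk_bytes / 1e9:.1f} GB")
```

For 5,000 numeric points polled every 10 seconds with 3 days of offline tolerance, this gives 129,600,000 entries and roughly 5.2 GB of disk.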
Republishing Past Data
The gRPC publisher supports republishing historical data to the data source. Select a date range and choose whether to republish point values and/or events, then press the republish button. Republishing is aborted if the queue reaches its warning threshold. Be aware that large republish operations can back up the queue and delay live data.
Legacy PTCP Data Source Configuration
For new installations, use the gRPC data source and publisher instead.
Data Source Settings
| Setting | Description |
|---|---|
| Name | A descriptive name for the data source. |
| Port | The TCP port on which the data source listens for publisher connections (1-65535). |
| Authorization key | A shared secret between the publisher and data source. Prevents unauthorized connections. |
| Accept point settings updates | When checked, point property changes sent by the publisher are applied locally. |
| Override permissions updates | When checked, new point permissions are set from the data source's configured permissions rather than the publisher's. |
| Use compression | Enable when the publisher sends compressed data. |
| Use CRC checksum | Validates data with a 32-bit CRC checksum. Adds 4 bytes per packet. Recommended for noisy transmission lines. |
| Use encryption | Encrypts data using AES. Requires a shared key in hexadecimal format. |
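To illustrate what a 32-bit CRC adds to each packet, the sketch below appends a 4-byte checksum to a payload and verifies it on receipt. The PTCP wire format is not described here, so the use of zlib's CRC-32 and a big-endian trailer are assumptions made purely for illustration.

```python
import struct
import zlib

def append_crc32(payload: bytes) -> bytes:
    """Return payload followed by its CRC-32 as a 4-byte big-endian trailer."""
    return payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)

def verify_crc32(packet: bytes) -> bytes:
    """Check the trailing CRC-32 and return the payload, or raise ValueError."""
    if len(packet) < 4:
        raise ValueError("packet too short to contain a CRC-32 trailer")
    payload, trailer = packet[:-4], packet[-4:]
    (expected,) = struct.unpack(">I", trailer)
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("CRC mismatch: packet was corrupted in transit")
    return payload
```

A single flipped bit in the payload changes the computed CRC, so `verify_crc32` rejects the packet instead of passing corrupted values on.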
Encryption Key Sizes
| Key Size | Minimum Shared Key Length |
|---|---|
| 128-bit | 16 hexadecimal bytes |
| 192-bit | 20 hexadecimal bytes |
| 256-bit | 32 hexadecimal bytes |
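A shared key can be generated with any cryptographically secure random source. The sketch below assumes "hexadecimal bytes" means bytes written as two hex digits each, so a 32-byte key is a 64-character hex string, which meets the minimum for every key size in the table. The helper names are hypothetical.

```python
import secrets

def generate_shared_key(num_bytes: int = 32) -> str:
    """Return a random key of num_bytes bytes, encoded as a hex string."""
    return secrets.token_hex(num_bytes)

def key_length_bytes(hex_key: str) -> int:
    """Return how many bytes a hex-encoded key represents."""
    return len(bytes.fromhex(hex_key))
```

Generate the key once and enter the same value on both the publisher and the data source.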
Legacy PTCP Publisher Settings
| Setting | Description |
|---|---|
| Host | IP address or hostname of the remote Mango instance running the PTCP data source. |
| Port | Port number of the remote data source. |
| Authorization key | Must match the data source's authorization key. |
| XID prefix | Prefix for data point XIDs to ensure uniqueness on the central instance. |
| Synchronize historical data | Enables automatic gap-filling of historical data using a binary search algorithm. Configure with a cron pattern or predefined schedule. |
| Number of threads for sync | Higher values use more CPU but reduce synchronization time. |
| Transmit real time data | Whether to send real-time point values (used for display only, not saved on receiver). |
| Use CRC checksum | Must match the data source setting. |
| Use encryption | Must match the data source setting, including key and key size. |
Extra Tags
Both the data source and publisher can define extra tags added to data points during synchronization. If both define a tag with the same key, the data source's value takes precedence.
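The precedence rule can be expressed as a dictionary merge in which the data source's tags are applied last. This is a minimal sketch of the rule as described above; `merge_extra_tags` is a hypothetical name.

```python
def merge_extra_tags(publisher_tags: dict, data_source_tags: dict) -> dict:
    """Combine extra tags; on a key conflict the data source's value wins."""
    merged = dict(publisher_tags)
    merged.update(data_source_tags)  # applied last, so it overrides
    return merged
```

For example, if the publisher defines `location=site-a` and `line=1` and the data source defines `location=north`, the synchronized point carries `location=north` and `line=1`.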
Setting Point Values
If a published point is settable on the publishing end, its corresponding persistent point on the receiving end is also settable. When you set a value on the receiver, it is transmitted back to the publisher and applied to the original point. The value is not saved locally on the receiver; instead, historical synchronization is expected to push the confirmed value back, which confirms that the set succeeded.
Common Patterns
Edge-to-Cloud Architecture
The most common topology is multiple edge Mango instances (at remote sites) each running a gRPC publisher that connects to a central cloud Mango instance running a gRPC data source. Each edge publisher uses a unique XID prefix (e.g., site-a-, site-b-) and extra tags (e.g., location=site-a) to identify the source of each data point on the central instance.
High-Availability Data Transfer
Use the gRPC publisher's persistent queue to handle network outages. Size the queue to hold several days of data, so that if the connection between edge and cloud is interrupted, no data is lost as long as the outage is shorter than the period the queue covers. When the connection is restored, the queue drains automatically and the central instance receives all missing data.
Hierarchical Aggregation
In large deployments, intermediate Mango instances can act as both data sources (receiving from edge sites) and publishers (forwarding to a central instance). This creates a hierarchical topology that reduces the number of direct connections to the central instance.
Troubleshooting
Publisher Cannot Connect
- Network connectivity -- verify the publisher can reach the data source's IP and port. Check firewalls on both sides.
- Authentication failure -- for gRPC, verify the authentication token is valid and not expired. For legacy PTCP, verify the authorization key matches exactly on both sides.
- TLS mismatch -- for gRPC, ensure TLS settings are consistent between client and server. Check that certificates are valid and trusted.
- Port conflict -- ensure the data source's listening port is not used by another service.
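A quick way to rule out basic network problems is to attempt a plain TCP connection from the publisher's host to the data source's host and port. The sketch below does only that; it does not test TLS or authentication, and the host and port in the usage note are placeholders.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run it from the edge instance's host, for example `can_connect("central.example.com", 9090)`, substituting the host and port configured on your publisher. A `False` result points to a firewall, routing, DNS, or port problem rather than a token or TLS issue.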
Data Points Not Synchronizing
- XID length exceeded -- if the XID prefix plus the original XID exceeds 100 characters, synchronization fails. Shorten the prefix.
- Permission sync mode -- check that permission synchronization settings are compatible between the publisher and data source.
- Full resynchronization needed -- some setting changes require a full resync. Restart the publisher to trigger it.
Queue Growing Without Draining
- Connection down -- the publisher may not be connected to the data source. Check events and logs for connection errors.
- Data source not accepting data -- the data source may be disabled or experiencing errors. Check events on the data source side.
- Max entries per request too low -- if the ingress rate exceeds the send rate, increase the max entries per request.
- Disk space -- monitor disk space. If the queue file grows too large, reduce the maximum queue size or add disk capacity.
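Whether a backlog will ever clear depends on the margin between the send rate and the ingress rate. The sketch below estimates drain time from figures such as the queue size and ingress rate shown on the status tab; the send rate is something you would have to measure yourself, and `drain_seconds` is a hypothetical helper.

```python
def drain_seconds(queue_size, ingress_per_second, send_per_second):
    """Estimate seconds until the queue is empty, or None if it never drains."""
    net_rate = send_per_second - ingress_per_second
    if net_rate <= 0:
        return None  # the queue is growing or holding steady
    return queue_size / net_rate
```

A queue of 1,000,000 entries with 500 entries per second arriving and 1,500 per second being sent drains in about 1,000 seconds. If the function returns `None`, the send rate must be raised before the backlog can shrink.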
Historical Sync Issues (Legacy PTCP)
- Duplicate values -- if using a minimum overlap greater than 0, ensure the receiving data store handles duplicates (MangoNoSQL does; SQL databases may not).
- Sync not completing -- check the sync response timeout. If it is too short, the data source cannot respond in time.
- Data gaps persist -- use the "Reset sync history" button to force the publisher to re-check all point histories. This can be an expensive operation.
Related Pages
- Data Sources Overview — General data source and data point concepts
- TCP/IP Data Source — General-purpose TCP/IP socket communication for custom protocols
- MQTT Client Data Source — Another event-driven data source for real-time messaging
- Data Source Performance — Tuning throughput and monitoring for high-speed data synchronization