Related Pages
Related topics: Configuration
The following files were used as context for generating this wiki page:
- src/starfish/telemetry/init.py
- src/starfish/telemetry/posthog_client.py
- src/starfish/data_factory/factory_.py
- src/starfish/data_factory/utils/data_class.py
- src/starfish/common/env_loader.py
- src/starfish/llm/prompt/prompt_template.py
Telemetry
Telemetry in Starfish collects minimal, anonymous usage data to help improve the library. The data shows how features are used and how they perform, which helps identify areas for optimization and bug fixing. Participation is optional, and users can opt out via an environment variable. The telemetry system uses posthog to send events covering job execution, platform, and configuration. src/starfish/telemetry/posthog_client.py, src/starfish/data_factory/factory_.py
Telemetry Configuration
The telemetry system is configured using environment variables and a configuration file stored in the application data directory. src/starfish/telemetry/posthog_client.py
Configuration Parameters
The AnalyticsConfig dataclass holds the configuration parameters for the telemetry service. src/starfish/telemetry/posthog_client.py
Parameter | Type | Description | Source |
---|---|---|---|
api_key | str | The API key for the analytics service (Posthog). | src/starfish/telemetry/posthog_client.py:41 |
active | bool | Flag to enable or disable telemetry. Defaults to True. | src/starfish/telemetry/posthog_client.py:42 |
verbose | bool | Flag for verbose logging. Defaults to False. | src/starfish/telemetry/posthog_client.py:43 |
endpoint | Optional[str] | Optional custom endpoint for the analytics service. | src/starfish/telemetry/posthog_client.py:44 |
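The parameters in the table above could be modelled roughly as follows. This is a minimal sketch built only from the fields and defaults listed in the table; the None default for endpoint and the placeholder key are assumptions, and the actual definition in posthog_client.py may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnalyticsConfig:
    """Sketch of the telemetry settings described in the table above."""

    api_key: str                    # API key for the analytics service
    active: bool = True             # telemetry is on unless disabled
    verbose: bool = False           # verbose logging is off by default
    endpoint: Optional[str] = None  # assumption: no custom endpoint by default


# "placeholder-key" is an illustrative value, not a real key.
config = AnalyticsConfig(api_key="placeholder-key")
print(config.active, config.verbose, config.endpoint)
```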
Opting Out
Users can disable telemetry by setting the TELEMETRY_ENABLED environment variable to false. README.md
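As an illustration, the variable can be set in the shell (export TELEMETRY_ENABLED=false) or from Python before telemetry is initialized. The sketch below also shows one way such a flag might be interpreted; telemetry_enabled is a hypothetical helper, and the exact set of values Starfish accepts is an assumption.

```python
import os


def telemetry_enabled(default: bool = True) -> bool:
    """Hypothetical helper: interpret TELEMETRY_ENABLED as a boolean.

    Assumption: only the documented value "false" (case-insensitive)
    disables telemetry; an unset variable falls back to the default.
    """
    raw = os.environ.get("TELEMETRY_ENABLED")
    if raw is None:
        return default
    return raw.strip().lower() != "false"


# Opt out for the current process.
os.environ["TELEMETRY_ENABLED"] = "false"
print(telemetry_enabled())
```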
Telemetry Data Collection
The telemetry system collects data related to data factory jobs and sends it to the analytics service. src/starfish/data_factory/factory_.py
Telemetry Events
Telemetry events are represented by the Event dataclass, which includes the event name, data, and a unique client ID. src/starfish/telemetry/posthog_client.py
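A minimal sketch of such an event container is shown below. The field names (name, data, client_id), their defaults, and the sample values are assumptions for illustration; only the three kinds of information come from the description above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """Sketch of a telemetry event: a name, a payload, and a client ID."""

    name: str
    data: Dict[str, Any] = field(default_factory=dict)
    client_id: str = ""  # assumption: filled in before the event is sent


# Illustrative values only; these are not real Starfish event names.
event = Event(name="example_event", data={"run_mode": "normal"}, client_id="abc-123")
print(event)
```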
Data Factory Telemetry
The DataFactory class in src/starfish/data_factory/factory_.py sends telemetry events at the end of a job. This includes information about the job configuration, execution environment, and outcome. src/starfish/data_factory/factory_.py
The TelemetryData dataclass is used to structure the data sent with the telemetry event. src/starfish/data_factory/utils/data_class.py
Telemetry Data Attributes
Attribute | Type | Description | Source |
---|---|---|---|
job_id | str | Identifier for the job. | src/starfish/data_factory/utils/data_class.py |
target_reached | bool | Whether the target count was achieved. | src/starfish/data_factory/utils/data_class.py |
run_mode | str | Execution mode of the job. | src/starfish/data_factory/utils/data_class.py |
num_inputs | int | Number of input records processed. | src/starfish/data_factory/utils/data_class.py |
library_version | str | Version of the processing library. | src/starfish/data_factory/utils/data_class.py |
config | dict | Configuration parameters for the job. | src/starfish/data_factory/utils/data_class.py |
error_summary | dict | Summary of errors encountered during the job. | src/starfish/data_factory/utils/data_class.py |
count_summary | dict | Summary of record counts (completed, failed, filtered). | src/starfish/data_factory/utils/data_class.py, src/starfish/data_factory/factory_.py |
run_time_platform | str | The platform on which the job is run. | src/starfish/data_factory/utils/data_class.py, src/starfish/data_factory/factory_.py |
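The attributes in the table above can be sketched as a dataclass that is converted to a plain dictionary before being attached to an event. The field names and types follow the table; the default values and all sample values are assumptions, and the real class in data_class.py may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class TelemetryData:
    """Sketch of the job summary described in the table above."""

    job_id: str
    target_reached: bool
    run_mode: str
    num_inputs: int
    library_version: str
    # Assumption: the dict fields default to empty and the platform to "".
    config: Dict[str, Any] = field(default_factory=dict)
    error_summary: Dict[str, Any] = field(default_factory=dict)
    count_summary: Dict[str, Any] = field(default_factory=dict)
    run_time_platform: str = ""


# Illustrative values only.
data = TelemetryData(
    job_id="job-001",
    target_reached=True,
    run_mode="normal",
    num_inputs=10,
    library_version="0.0.0",
    count_summary={"completed": 10, "failed": 0, "filtered": 0},
    run_time_platform="script",
)

# asdict() produces a plain dict suitable for use as an event payload.
payload = asdict(data)
print(sorted(payload))
```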
Sending Telemetry
The _send_telemetry_event method in the DataFactory class is responsible for sending the telemetry data to the analytics service. src/starfish/data_factory/factory_.py
Analytics Service
The AnalyticsService class handles the communication with the analytics backend (Posthog). src/starfish/telemetry/posthog_client.py
Service Setup
The _setup_client method configures the Posthog client. It checks if telemetry is active and initializes the client with the API key and endpoint. If telemetry is disabled, it uses a NoOpPosthog client. src/starfish/telemetry/posthog_client.py
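The active/disabled branching described above follows the null-object pattern, which can be sketched as follows. This is not the library's implementation: setup_client, make_client, RecordingClient, and the api_key and host keyword names are hypothetical, and the client factory is injected so the sketch runs without the posthog package installed.

```python
from typing import Any, Callable, Dict, Optional


class NoOpPosthog:
    """Stand-in client whose capture() silently discards every event."""

    def capture(self, *args: Any, **kwargs: Any) -> None:
        return None


class RecordingClient:
    """Hypothetical replacement for a real client; it only stores its kwargs."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


def setup_client(
    active: bool,
    api_key: str,
    endpoint: Optional[str],
    make_client: Callable[..., Any],
) -> Any:
    """Return a configured client when telemetry is active, else a no-op one."""
    if not active:
        return NoOpPosthog()
    kwargs: Dict[str, Any] = {"api_key": api_key}
    if endpoint is not None:
        kwargs["host"] = endpoint  # only pass an endpoint when one is set
    return make_client(**kwargs)


client = setup_client(False, "placeholder-key", None, RecordingClient)
print(type(client).__name__)  # NoOpPosthog
```

Because the no-op client exposes the same capture method, calling code does not need to check whether telemetry is enabled before sending an event.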
Capturing Events
The capture_event method sends an event to the analytics service. It ensures that the event data includes the client ID. src/starfish/telemetry/posthog_client.py
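One way to guarantee that the client ID is present in the event data is sketched below. The RecordingClient class, its capture signature, and the "client_id" key are assumptions made so the example is self-contained; they are not taken from posthog_client.py.

```python
from typing import Any, Dict, List, Tuple


class RecordingClient:
    """Stand-in analytics client that stores calls instead of sending them."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, str, Dict[str, Any]]] = []

    def capture(self, distinct_id: str, event: str, properties: Dict[str, Any]) -> None:
        self.calls.append((distinct_id, event, properties))


def capture_event(
    client: RecordingClient, name: str, data: Dict[str, Any], client_id: str
) -> None:
    """Send an event, adding the client ID to the payload if it is missing."""
    properties = dict(data)  # copy so the caller's dict is not mutated
    properties.setdefault("client_id", client_id)
    client.capture(client_id, name, properties)


recorder = RecordingClient()
original = {"run_mode": "normal"}
capture_event(recorder, "example_event", original, "abc-123")
print(recorder.calls[0][2])
```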
Client ID Generation
The TelemetryConfig class is responsible for generating and retrieving a unique client identifier. The identifier is stored in a file in the application data directory. src/starfish/telemetry/posthog_client.py
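The generate-once, reuse-afterwards behaviour described above can be sketched as follows. The function name, the "client_id" file name, and the use of a random UUID are assumptions; a temporary directory stands in for the application data directory.

```python
import tempfile
import uuid
from pathlib import Path


def get_or_create_client_id(config_dir: Path) -> str:
    """Return the stored client ID, creating and saving one if none exists."""
    id_file = config_dir / "client_id"  # hypothetical file name
    if id_file.exists():
        stored = id_file.read_text(encoding="utf-8").strip()
        if stored:
            return stored
    # No usable ID on disk: generate a random one and persist it.
    config_dir.mkdir(parents=True, exist_ok=True)
    client_id = str(uuid.uuid4())
    id_file.write_text(client_id, encoding="utf-8")
    return client_id


with tempfile.TemporaryDirectory() as tmp:
    first = get_or_create_client_id(Path(tmp))
    second = get_or_create_client_id(Path(tmp))
    print(first == second)  # True
```

A random identifier of this kind carries no personal information, which is consistent with the anonymous data collection described at the top of this page.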