Releases: redpanda-data/connect
v4.49.1
For installation instructions check out the getting started guide.
Added
- Output snowflake_streaming has two new stats snowflake_register_latency_ns and snowflake_commit_latency_ns. (@rockwotj)
Changed
- Field snapshot_memory_safety_factor is now removed for input postgres_cdc. The batch size must now be defined explicitly; its default is 1000. (@rockwotj)
- Input postgres_cdc now supports intra-table snapshot read parallelism in addition to inter-table parallelism. (@rockwotj)
The full change log can be found here.
v4.48.1
For installation instructions check out the getting started guide.
Added
- Enterprise licenses can now be loaded directly from an environment variable REDPANDA_LICENSE. (@rockwotj)
- Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
- New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
- Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
- Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)
- Input postgres_cdc now emits logical messages to the WAL every hour by default to allow WAL reclaiming for low frequency tables. This frequency is controlled by field heartbeat_interval. (@rockwotj)
- Output snowflake_streaming now has a commit_timeout field to control how long to wait for a commit in Snowflake. (@rockwotj)
- Output snowflake_streaming now has a url field to override the hostname for connections to Snowflake, which is required for private link deployments. (@rockwotj)
- All sql_* components now support the clickhouse driver in cloud builds. (@mihaitodor)
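As a rough illustration of what a PEM-format check on private_key guards against, the Python sketch below tests whether a string has matching BEGIN/END armor lines around a base64 body. It is not the lint rule that Connect ships; the function name looks_like_pem and the regular expression are hypothetical and only cover the simple unencrypted case.

```python
import re

# Hypothetical helper, not Connect's lint rule: accepts text with matching
# "-----BEGIN <LABEL>-----" / "-----END <LABEL>-----" lines around base64 data.
_PEM_RE = re.compile(
    r"\A\s*-----BEGIN (?P<label>[A-Z0-9 ]+)-----\r?\n"
    r"(?P<body>[A-Za-z0-9+/=\r\n]+?)"
    r"-----END (?P=label)-----\s*\Z"
)


def looks_like_pem(text: str) -> bool:
    """Return True if text is shaped like a single PEM block."""
    return _PEM_RE.match(text) is not None


if __name__ == "__main__":
    armored = (
        "-----BEGIN PRIVATE KEY-----\n"
        "MIIBVgIBADANBg==\n"
        "-----END PRIVATE KEY-----\n"
    )
    print(looks_like_pem(armored))             # armored key
    print(looks_like_pem("MIIBVgIBADANBg=="))  # bare base64, no armor
```

A bare base64 string with the armor lines stripped is the kind of value such a check would flag.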
Fixed
- Fix an issue in the snowflake_streaming output when the user manually evolves the schema in their pipeline that could lead to elevated error rates in the connector. (@rockwotj)
- Fixed a bug in the redpanda_migrator_offsets input and output: the consumer group update migration logic based on timestamp lookup should no longer skip ahead in the destination cluster. This should enforce at-least-once delivery guarantees. (@mihaitodor)
- The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input. (@mihaitodor)
- Transient errors in snowflake_streaming are now automatically retried in cases where it's determined to be safe to do so. (@rockwotj)
- Fixed a panic in the sftp input when Connect shuts down. (@mihaitodor)
- Fixed an issue where mysql_cdc would not work with timestamps without the parseTime=true DSN parameter. (@rockwotj)
- Fixed an issue where timestamps at extreme year bounds (i.e. year 0 or year 9999) would be encoded incorrectly in snowflake_streaming. (@rockwotj)
- The aws_s3 input now drops SQS notifications and emits a warning log message for files which have been deleted before Connect was able to read them. (@mihaitodor)
- Fixed a bug in snowflake_streaming where string/bytes values that were the min or max value for a column in a batch and were over 32 characters could be corrupted if the write was retried. (@rockwotj)
Changed
- Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
- Input postgres_cdc no longer adds a prefix to the replication slot name. If upgrading from a previous version, prefix your current replication slot with rs_ to continue to use the same replication slot. (@rockwotj)
- The redpanda_migrator output now uses the source topic config when creating a topic in the destination cluster. It also attempts to transfer topic ACLs to the destination cluster even if the topics already exist. (@mihaitodor)
- When preserve_logical_types is true in schema_registry_decode, convert time logical types into bloblang timestamps instead of duration strings. (@rockwotj)
The full change log can be found here.
v4.48.0-rc3
For installation instructions check out the getting started guide.
NOTE: This is a release candidate, you can download a binary from this page.
Added
- Enterprise licenses can now be loaded directly from an environment variable REDPANDA_LICENSE. (@rockwotj)
- Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
- New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
- Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
- Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)
- Input postgres_cdc now emits logical messages to the WAL every hour by default to allow WAL reclaiming for low frequency tables. This frequency is controlled by field heartbeat_interval. (@rockwotj)
- Output snowflake_streaming now has a commit_timeout field to control how long to wait for a commit in Snowflake. (@rockwotj)
- Output snowflake_streaming now has a url field to override the hostname for connections to Snowflake, which is required for private link deployments. (@rockwotj)
Fixed
- Fix an issue in the snowflake_streaming output when the user manually evolves the schema in their pipeline that could lead to elevated error rates in the connector. (@rockwotj)
- Fixed a bug in the redpanda_migrator_offsets input and output: the consumer group update migration logic based on timestamp lookup should no longer skip ahead in the destination cluster. This should enforce at-least-once delivery guarantees. (@mihaitodor)
- The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input. (@mihaitodor)
- Transient errors in snowflake_streaming are now automatically retried in cases where it's determined to be safe to do so. (@rockwotj)
- Fixed a panic in the sftp input when Connect shuts down. (@mihaitodor)
Changed
- Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
- Input postgres_cdc no longer adds a prefix to the replication slot name. If upgrading from a previous version, prefix your current replication slot with rs_ to continue to use the same replication slot. (@rockwotj)
The full change log can be found here.
v4.48.0-rc2
For installation instructions check out the getting started guide.
NOTE: This is a release candidate, you can download a binary from this page.
Fixed
- Fixed a bug in the redpanda_migrator_offsets input and output: the consumer group update migration logic based on timestamp lookup should no longer skip ahead in the destination cluster. This should enforce at-least-once delivery guarantees. (@mihaitodor)
- The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input. (@mihaitodor)
Added
- Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
- New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
- Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
- Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)
Fixed
- Fix an issue in the snowflake_streaming output when the user manually evolves the schema in their pipeline that could lead to elevated error rates in the connector. (@rockwotj)
Changed
- Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
The full change log can be found here.
v4.48.0-rc1
For installation instructions check out the getting started guide.
NOTE: This is a release candidate, you can download a binary from this page.
Added
- Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
- New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
Fixed
- Fix an issue in the snowflake_streaming output when the user manually evolves the schema in their pipeline that could lead to elevated error rates in the connector. (@rockwotj)
Changed
- Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
The full change log can be found here.
v4.47.1
For installation instructions check out the getting started guide.
Fixed
- Fix an issue with leftover staging files remaining in the snowflake_streaming output. (@rockwotj)
The full change log can be found here.
v4.47.0
For installation instructions check out the getting started guide.
Added
- Field arguments added to the amqp_0_9 input and output. (@calini)
- Field avro.mapping added to the schema_registry_decode processor to support converting custom avro types to standard avro types for legacy tooling. (@rockwotj)
- (Benthos) A crash processor for FATAL logging. (@rockwotj)
- (Benthos) A uuid_v7 bloblang function. (@rockwotj)
- (Benthos) Field disable_http2 added to the http_client input and output and to the http processor. (@mihaitodor)
- New elasticsearch_v8 output which supersedes the existing elasticsearch output that uses a deprecated Elasticsearch library. (@ooesili)
- Field retry_on_conflict added to elasticsearch output to retry operations in case there are document version conflicts.
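For readers unfamiliar with version 7 UUIDs, the Python sketch below shows the bit layout described in RFC 9562: a 48-bit Unix millisecond timestamp, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. It assumes the uuid_v7 bloblang function follows that layout, which these notes do not state; it is not the Bloblang implementation, and the function name and unix_ms parameter are hypothetical.

```python
import os
import time
import uuid
from typing import Optional


def uuid_v7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUID with the RFC 9562 version 7 layout (illustrative sketch)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant

    value = (unix_ms & ((1 << 48) - 1)) << 80     # timestamp in the top 48 bits
    value |= 0x7 << 76                            # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)


if __name__ == "__main__":
    print(uuid_v7())
```

Because the timestamp occupies the most significant bits, identifiers created in later milliseconds compare greater than earlier ones, which is the usual reason to prefer version 7 for database keys.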
The full change log can be found here.
v4.47.0-rc2
For installation instructions check out the getting started guide.
NOTE: This is a release candidate, you can download a binary from this page.
Added
- Field arguments added to the amqp_0_9 input and output. (@calini)
- Field avro.mapping added to the schema_registry_decode processor to support converting custom avro types to standard avro types for legacy tooling. (@rockwotj)
- (Benthos) A crash processor for FATAL logging. (@rockwotj)
- (Benthos) A uuid_v7 bloblang function. (@rockwotj)
- (Benthos) Field disable_http2 added to the http_client input and output and to the http processor. (@mihaitodor)
- New elasticsearch_v8 output which supersedes the existing elasticsearch output that uses a deprecated Elasticsearch library. (@ooesili)
- Field retry_on_conflict added to elasticsearch output to retry operations in case there are document version conflicts.
The full change log can be found here.
v4.47.0-rc1
For installation instructions check out the getting started guide.
NOTE: This is a release candidate, you can download a binary from this page.
Added
- Field arguments added to the amqp_0_9 input and output. (@calini)
- Field avro.mapping added to the schema_registry_decode processor to support converting custom avro types to standard avro types for legacy tooling. (@rockwotj)
- (Benthos) A crash processor for FATAL logging. (@rockwotj)
- (Benthos) A uuid_v7 bloblang function. (@rockwotj)
- (Benthos) Field disable_http2 added to the http_client input and output and to the http processor. (@mihaitodor)
The full change log can be found here.
v4.46.0
For installation instructions check out the getting started guide.
Added
- New mysql_cdc input supporting change data capture (CDC) from MySQL. (@rockwotj, @le-vlad)
- Field instance_id added to kafka, kafka_franz, ockam_kafka, redpanda, redpanda_common, and redpanda_migrator inputs. (@rockwotj)
- Fields rebalance_timeout, session_timeout and heartbeat_interval added to the kafka_franz, redpanda, redpanda_common, redpanda_migrator and ockam_kafka inputs. (@rockwotj)
- Field avro.preserve_logical_types for processor schema_registry_decode was added to preserve logical types instead of decoding them as their primitive representation. (@rockwotj)
- Processor schema_registry_decode now adds metadata schema_id for the schema's ID in the schema registry. (@rockwotj)
- Field schema_evolution.processors added to snowpipe_streaming to support side effects or enrichment during schema evolution. (@rockwotj)
- Field unchanged_toast_value added to postgres_cdc to control the value substituted for unchanged toast values when a table does not have full replica identity. (@rockwotj)
Fixed
- Fix a snapshot stream consistency issue with postgres_cdc where data could be missed if writes were happening during the snapshot phase. (@rockwotj)
- Fix an issue where @table metadata was quoted for the snapshot phase in postgres_cdc. (@rockwotj)
Changed
- Field avro_raw_json was deprecated in favor of avro.raw_unions for processor schema_registry_decode. (@rockwotj)
- The snowpipe_streaming output now has better error handling for authentication failures when uploading to cloud storage. (@rockwotj)
- Field schema_evolution.new_column_type_mapping for snowpipe_streaming is deprecated and can be replaced with schema_evolution.processors. (@rockwotj)
- Increased the default values for max_message_bytes and broker_write_max_bytes by using IEC units instead of SI units. This better matches defaults in Redpanda and Kafka. (@rockwotj)
- Dropped support for postgres 10 and 11 in postgres_cdc. (@rockwotj)
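To see why switching from SI to IEC units raises a default, the Python sketch below compares the byte count of the same nominal size under each convention. The actual default values of max_message_bytes and broker_write_max_bytes are not given in these notes, so the sizes used here are only samples, and to_bytes is a hypothetical helper.

```python
# SI units are powers of 1000; IEC units are powers of 1024.
UNITS = {
    "KB": 1000,
    "MB": 1000 ** 2,
    "KiB": 1024,
    "MiB": 1024 ** 2,
}


def to_bytes(value: int, unit: str) -> int:
    """Convert a size such as (1, "MiB") to a byte count."""
    return value * UNITS[unit]


if __name__ == "__main__":
    si = to_bytes(1, "MB")    # 1,000,000 bytes
    iec = to_bytes(1, "MiB")  # 1,048,576 bytes
    print(f"1 MB = {si} bytes, 1 MiB = {iec} bytes, difference = {iec - si}")
```

The same nominal number read as an IEC unit is about 4.9% larger at the mebibyte scale, which is the increase the change refers to.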
The full change log can be found here.