Releases: redpanda-data/connect

v4.49.1

06 Mar 14:02
608ba75

For installation instructions check out the getting started guide.

Added

  • Output snowflake_streaming has two new stats: snowflake_register_latency_ns and snowflake_commit_latency_ns. (@rockwotj)

Changed

  • Field snapshot_memory_safety_factor has been removed from input postgres_cdc. The batch size must now be defined explicitly; its default is 1000. (@rockwotj)
  • Input postgres_cdc now supports intra-table snapshot read parallelism in addition to inter-table parallelism. (@rockwotj)

The full change log can be found here.

v4.48.1

04 Mar 04:04
bc8c9e2

For installation instructions check out the getting started guide.

Added

  • Enterprise licenses can now be loaded directly from an environment variable REDPANDA_LICENSE. (@rockwotj)
  • Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
  • New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
  • Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
  • Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)
  • Input postgres_cdc now emits logical messages to the WAL every hour by default to allow WAL reclaiming for low-frequency tables. This frequency is controlled by field heartbeat_interval. (@rockwotj)
  • Output snowflake_streaming now has a commit_timeout field to control how long to wait for a commit in Snowflake. (@rockwotj)
  • Output snowflake_streaming now has a url field to override the hostname for connections to Snowflake, which is required for private link deployments. (@rockwotj)
  • All sql_* components now support the clickhouse driver in cloud builds. (@mihaitodor)
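
To illustrate the kind of check the private_key lint rule above describes, here is a minimal sketch in Python. It is not the rule Connect ships: the function name looks_like_pem, the regular expression, and the decision to reject PEM blocks that carry extra header lines are all assumptions made for this example.

```python
import base64
import re

# Hypothetical pattern: a BEGIN line, a body, and an END line with the same label.
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s+(.*?)\s+-----END \1-----",
    re.DOTALL,
)


def looks_like_pem(text: str) -> bool:
    """Return True if text is a single PEM block with a base64 body."""
    match = PEM_BLOCK.fullmatch(text.strip())
    if match is None:
        return False
    # Join the wrapped body lines before decoding.
    body = "".join(match.group(2).split())
    if not body:
        return False
    try:
        base64.b64decode(body, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return True
```

A check like this catches a common mistake, such as pasting only the base64 body of a key without its BEGIN and END lines.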

Fixed

  • Fixed an issue in the snowflake_streaming output that could lead to elevated error rates in the connector when the user manually evolves the schema in their pipeline. (@rockwotj)
  • Fixed a bug in the redpanda_migrator_offsets input and output where the consumer group update migration logic, which is based on timestamp lookup, could skip ahead in the destination cluster. This fix should enforce at-least-once delivery guarantees. (@mihaitodor)
  • The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input. (@mihaitodor)
  • Transient errors in snowflake_streaming are now automatically retried in cases where it is determined to be safe to do so. (@rockwotj)
  • Fixed a panic in the sftp input when Connect shuts down. (@mihaitodor)
  • Fixed an issue where mysql_cdc would not work with timestamps without the parseTime=true DSN parameter. (@rockwotj)
  • Fixed an issue where timestamps at extreme year bounds (i.e. year 0 or year 9999) would be encoded incorrectly in snowflake_streaming. (@rockwotj)
  • The aws_s3 input now drops SQS notifications and emits a warning log message for files which have been deleted before Connect was able to read them. (@mihaitodor)
  • Fixed a bug in snowflake_streaming where string/bytes values that were the min or max value for a column in a batch and were over 32 characters long could be corrupted if the write was retried. (@rockwotj)

Changed

  • Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
  • Input postgres_cdc no longer adds a prefix to the replication slot name. If upgrading from a previous version, prefix your current replication slot with rs_ to continue using the same replication slot. (@rockwotj)
  • The redpanda_migrator output now uses the source topic config when creating a topic in the destination cluster. It also attempts to transfer topic ACLs to the destination cluster even if the topics already exist. (@mihaitodor)
  • When preserve_logical_types is true in schema_registry_decode, time logical types are now converted into bloblang timestamps instead of duration strings. (@rockwotj)
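
The postgres_cdc replication slot change above can be read as a simple renaming step when upgrading. The sketch below assumes that older versions always prepended rs_ to the configured slot name, which is inferred from the note rather than confirmed; the helper name slot_name_after_upgrade is hypothetical.

```python
# Prefix named in the release note above.
LEGACY_PREFIX = "rs_"


def slot_name_after_upgrade(previously_configured: str) -> str:
    """Return the slot name to configure after upgrading.

    Assumption: the old version created the slot as "rs_" + configured name,
    so the prefix is always added, even if the old name already began with it.
    """
    return LEGACY_PREFIX + previously_configured
```

Under that assumption, a pipeline that previously configured the slot as orders would now configure rs_orders to keep reading from the same slot.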

The full change log can be found here.

v4.48.0-rc3

26 Feb 02:46
a078abc
Pre-release

For installation instructions check out the getting started guide.

NOTE: This is a release candidate. You can download a binary from this page.

Added

  • Enterprise licenses can now be loaded directly from an environment variable REDPANDA_LICENSE. (@rockwotj)
  • Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
  • New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
  • Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
  • Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)
  • Input postgres_cdc now emits logical messages to the WAL every hour by default to allow WAL reclaiming for low-frequency tables. This frequency is controlled by field heartbeat_interval. (@rockwotj)
  • Output snowflake_streaming now has a commit_timeout field to control how long to wait for a commit in Snowflake. (@rockwotj)
  • Output snowflake_streaming now has a url field to override the hostname for connections to Snowflake, which is required for private link deployments. (@rockwotj)

Fixed

  • Fixed an issue in the snowflake_streaming output that could lead to elevated error rates in the connector when the user manually evolves the schema in their pipeline. (@rockwotj)
  • Fixed a bug in the redpanda_migrator_offsets input and output where the consumer group update migration logic, which is based on timestamp lookup, could skip ahead in the destination cluster. This fix should enforce at-least-once delivery guarantees. (@mihaitodor)
  • The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input. (@mihaitodor)
  • Transient errors in snowflake_streaming are now automatically retried in cases where it is determined to be safe to do so. (@rockwotj)
  • Fixed a panic in the sftp input when Connect shuts down. (@mihaitodor)

Changed

  • Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)
  • Input postgres_cdc no longer adds a prefix to the replication slot name. If upgrading from a previous version, prefix your current replication slot with rs_ to continue using the same replication slot. (@rockwotj)

The full change log can be found here.

v4.48.0-rc2

20 Feb 20:29
b498a31
Pre-release

For installation instructions check out the getting started guide.

NOTE: This is a release candidate. You can download a binary from this page.

Added

  • Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
  • New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)
  • Field is_high_watermark added to the redpanda_migrator_offsets output. (@mihaitodor)
  • Metadata field kafka_is_high_watermark added to the redpanda_migrator_offsets input. (@mihaitodor)

Fixed

  • Fixed a bug in the redpanda_migrator_offsets input and output where the consumer group update migration logic, which is based on timestamp lookup, could skip ahead in the destination cluster. This fix should enforce at-least-once delivery guarantees. (@mihaitodor)
  • The redpanda_migrator_bundle output no longer drops messages if either the redpanda_migrator or the redpanda_migrator_offsets child output throws an error. Connect will keep retrying to write the messages and apply backpressure to the input. (@mihaitodor)
  • Fixed an issue in the snowflake_streaming output that could lead to elevated error rates in the connector when the user manually evolves the schema in their pipeline. (@rockwotj)

Changed

  • Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)

The full change log can be found here.

v4.48.0-rc1

20 Feb 01:56
Pre-release

For installation instructions check out the getting started guide.

NOTE: This is a release candidate. You can download a binary from this page.

Added

  • Added a lint rule to verify field private_key for the snowflake_streaming output is in PEM format. (@rockwotj)
  • New mongodb_cdc input for change data capture (CDC) over MongoDB collections. (@rockwotj)

Fixed

  • Fixed an issue in the snowflake_streaming output that could lead to elevated error rates in the connector when the user manually evolves the schema in their pipeline. (@rockwotj)

Changed

  • Output snowflake_streaming has additional logging and debug information when errors arise. (@rockwotj)

The full change log can be found here.

v4.47.1

10 Feb 23:12

For installation instructions check out the getting started guide.

Fixed

  • Fixed an issue where staging files were left behind in the snowflake_streaming output. (@rockwotj)

The full change log can be found here.

v4.47.0

07 Feb 20:09
0953711

For installation instructions check out the getting started guide.

Added

  • Field arguments added to the amqp_0_9 input and output. (@calini)
  • Field avro.mapping added to the schema_registry_decode processor to support converting custom avro types to standard avro types for legacy tooling. (@rockwotj)
  • (Benthos) A crash processor for FATAL logging. (@rockwotj)
  • (Benthos) A uuid_v7 bloblang function. (@rockwotj)
  • (Benthos) Field disable_http2 added to the http_client input and output and to the http processor. (@mihaitodor)
  • New elasticsearch_v8 output which supersedes the existing elasticsearch output that uses a deprecated Elasticsearch library. (@ooesili)
  • Field retry_on_conflict added to elasticsearch output to retry operations in case there are document version conflicts.
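
For readers unfamiliar with what a version 7 UUID contains, the sketch below builds one in Python following the bit layout in RFC 9562: a 48-bit Unix millisecond timestamp, the version nibble, 12 random bits, the variant bits, and 62 more random bits. It is an illustration only and says nothing about how the uuid_v7 bloblang function is implemented; the Python function name and its unix_ms parameter are made up for this example.

```python
import os
import time
import uuid


def uuid_v7(unix_ms=None):
    """Build a version 7 UUID from a millisecond timestamp and random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                           # top 12 bits
    rand_b = rand & ((1 << 62) - 1)               # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, IDs generated in later milliseconds sort after earlier ones, which is the main reason to prefer version 7 over fully random UUIDs for database keys.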

The full change log can be found here.

v4.47.0-rc2

06 Feb 23:16
8706c1e
Pre-release

For installation instructions check out the getting started guide.

NOTE: This is a release candidate. You can download a binary from this page.

Added

  • Field arguments added to the amqp_0_9 input and output. (@calini)
  • Field avro.mapping added to the schema_registry_decode processor to support converting custom avro types to standard avro types for legacy tooling. (@rockwotj)
  • (Benthos) A crash processor for FATAL logging. (@rockwotj)
  • (Benthos) A uuid_v7 bloblang function. (@rockwotj)
  • (Benthos) Field disable_http2 added to the http_client input and output and to the http processor. (@mihaitodor)
  • New elasticsearch_v8 output which supersedes the existing elasticsearch output that uses a deprecated Elasticsearch library. (@ooesili)
  • Field retry_on_conflict added to elasticsearch output to retry operations in case there are document version conflicts.

The full change log can be found here.

v4.47.0-rc1

04 Feb 19:04
2a7039f
Pre-release

For installation instructions check out the getting started guide.

NOTE: This is a release candidate. You can download a binary from this page.

Added

  • Field arguments added to the amqp_0_9 input and output. (@calini)
  • Field avro.mapping added to the schema_registry_decode processor to support converting custom avro types to standard avro types for legacy tooling. (@rockwotj)
  • (Benthos) A crash processor for FATAL logging. (@rockwotj)
  • (Benthos) A uuid_v7 bloblang function. (@rockwotj)
  • (Benthos) Field disable_http2 added to the http_client input and output and to the http processor. (@mihaitodor)

The full change log can be found here.

v4.46.0

29 Jan 16:21
83ad4d1

For installation instructions check out the getting started guide.

Added

  • New mysql_cdc input supporting change data capture (CDC) from MySQL. (@rockwotj, @le-vlad)
  • Field instance_id added to kafka, kafka_franz, ockam_kafka, redpanda, redpanda_common, and redpanda_migrator inputs. (@rockwotj)
  • Fields rebalance_timeout, session_timeout and heartbeat_interval added to the kafka_franz, redpanda, redpanda_common, redpanda_migrator and ockam_kafka inputs. (@rockwotj)
  • Field avro.preserve_logical_types for processor schema_registry_decode was added to preserve logical types instead of decoding them as their primitive representation. (@rockwotj)
  • Processor schema_registry_decode now adds metadata schema_id for the schema's ID in the schema registry. (@rockwotj)
  • Field schema_evolution.processors added to snowflake_streaming to support side effects or enrichment during schema evolution. (@rockwotj)
  • Field unchanged_toast_value added to postgres_cdc to control the value substituted for unchanged toast values when a table does not have full replica identity. (@rockwotj)

Fixed

  • Fixed a snapshot stream consistency issue with postgres_cdc where data could be missed if writes were happening during the snapshot phase. (@rockwotj)
  • Fixed an issue where @table metadata was quoted for the snapshot phase in postgres_cdc. (@rockwotj)

Changed

  • Field avro_raw_json was deprecated in favor of avro.raw_unions for processor schema_registry_decode. (@rockwotj)
  • The snowflake_streaming output now has better error handling for authentication failures when uploading to cloud storage. (@rockwotj)
  • Field schema_evolution.new_column_type_mapping for snowflake_streaming is deprecated and can be replaced with schema_evolution.processors. (@rockwotj)
  • Increased the default values for max_message_bytes and broker_write_max_bytes by using IEC units instead of SI units. This better matches defaults in Redpanda and Kafka. (@rockwotj)
  • Dropped support for PostgreSQL 10 and 11 in postgres_cdc. (@rockwotj)
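
The IEC versus SI change above is easier to see with numbers. The sketch below is a small Python size parser written for this note; the sizes 1MB and 1MiB are illustrative values, not the actual defaults of max_message_bytes or broker_write_max_bytes, and to_bytes is a hypothetical helper.

```python
# SI units are powers of 1000; IEC units are powers of 1024.
SI_UNITS = {"KB": 1000, "MB": 1000**2, "GB": 1000**3}
IEC_UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def to_bytes(size: str) -> int:
    """Convert a size string such as "1MB" or "1MiB" to a byte count."""
    units = {**SI_UNITS, **IEC_UNITS}
    # Try longer suffixes first so the three-letter IEC units are matched first.
    for suffix in sorted(units, key=len, reverse=True):
        if size.endswith(suffix):
            return int(size[: -len(suffix)]) * units[suffix]
    return int(size)  # no suffix: already a byte count


si = to_bytes("1MB")    # 1000000
iec = to_bytes("1MiB")  # 1048576
print(iec - si)         # 48576 extra bytes per "mega" unit
```

The gap is about 4.9% at the mega scale, so a limit expressed in IEC units is slightly larger than the same number expressed in SI units.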

The full change log can be found here.