Skip to content

[improve][broker]Part-3 of PIP-433: always allow replicator to register a new compatible schema#25461

Open
poorbarcode wants to merge 6 commits intoapache:masterfrom
poorbarcode:pip-433/p3
Open

[improve][broker]Part-3 of PIP-433: always allow replicator to register a new compatible schema#25461
poorbarcode wants to merge 6 commits intoapache:masterfrom
poorbarcode:pip-433/p3

Conversation

@poorbarcode
Copy link
Copy Markdown
Contributor

Motivation

This is part-3 of PIP-433, see the goal 1 in PIP-433

Modifications

always allow the replicator to register a new compatible schema, users can turn off this function

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@poorbarcode poorbarcode added this to the 5.0.0 milestone Apr 3, 2026
@poorbarcode poorbarcode self-assigned this Apr 3, 2026
@poorbarcode poorbarcode modified the milestones: 5.0.0, 4.3.0 Apr 3, 2026
@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Apr 3, 2026
Comment on lines +2753 to +2759
boolean isReplicatorProducer = false;
if (commandGetOrCreateSchema.hasProducerName()) {
isReplicatorProducer = Producer.isRemoteOrShadow(commandGetOrCreateSchema.getProducerName(),
getBrokerService().getPulsar().getConfig().getReplicatorPrefix());
}
CompletableFuture<SchemaVersion> schemaVersionFuture =
tryAddSchema(topic, schema, isReplicatorProducer);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be a security issue that user can just set producer name to pulsar.repl.* to get rid of the schema upload control? if user disabled is_allow_auto_update_schema

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will.

Comment on lines +2166 to +2171
@Option(names = { "--disable-for-replicator" },
description = "By default, brokers always allow replicator to register new compatible schemas even"
+ " when auto updates are disabled, if you want to disable replicators to register compatible schemas,"
+ " please set it to true")
private boolean disableForReplicator;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--disable-for-replicator is a primitive boolean defaulting to false, so every set-is-allow-auto-update-schema call sends allowAutoUpdateSchemaWithReplicator=true unless the flag is supplied. A user who ran ... --disable --disable-for-replicator and later runs ... --enable silently re-enables replicator updates — different from the REST API's null-preserving semantics.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also check the Java API

Copy link
Copy Markdown
Contributor Author

@poorbarcode poorbarcode Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

&& schema_compatibility_strategy == other.schema_compatibility_strategy
&& is_allow_auto_update_schema == other.is_allow_auto_update_schema
&& is_allow_auto_update_schema_with_replicator
== other.is_allow_auto_update_schema_with_replicator
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use Objects.equals()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion

description = "By default, brokers always allow replicator to register new compatible schemas even"
+ " when auto updates are disabled, if you want to disable replicators to register compatible schemas,"
+ " please set it to true")
private boolean disableForReplicator;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PIP says CLI flag is --enable-for-replicator, implementation uses --disable-for-replicator.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

}
return policies;
}, (policies) -> policies.is_allow_auto_update_schema,
"isAllowAutoUpdateSchema");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed isAllowAutoUpdateSchemaWithReplicator?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been modified at the line-2583

@Denovo1998
Copy link
Copy Markdown
Contributor

@poorbarcode @codelipenghui
Seems like there are some issues?

When geo-replicator creates a remote producer in the broker, it indeed carries the replicator's producer name and uses Schema.AUTO_PRODUCE_BYTES().

But the key point of AUTO_PRODUCE_BYTES is: when establishing a producer connection, it doesn't send the "specific schema of this message" all at once. The actual specific schema registration happens later during the message sending process, when encountering a new schema, it then triggers GET_OR_CREATE_SCHEMA. This call chain looks like this locally:

  1. When replicator acts as a producer to establish a connection, the ProducerImpl.connectionOpened(...) method sends the PRODUCER command, which includes the producerName. But later, when it actually needs to register a specific schema, it follows the path tryRegisterSchema(...) -> getOrCreateSchemaAsync(...).

  2. The method finally calls Commands' newGetOrCreateSchema(requestId, topic, schemaInfo).

  3. This command builder only wrote requestId/topic/schema, without writing producerName.

  4. So the CommandGetOrCreateSchema sent from the sender to the broker always has the field producerName absent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants