Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

editions: options interpretation can generate incorrect results #16756

Closed
jhump opened this issue May 6, 2024 · 3 comments · Fixed by #16864 or bufbuild/protocompile#309
Closed

editions: options interpretation can generate incorrect results #16756

jhump opened this issue May 6, 2024 · 3 comments · Fixed by #16864 or bufbuild/protocompile#309
Assignees
Labels

Comments

@jhump
Copy link
Contributor

jhump commented May 6, 2024

For Editions, it was necessary to split options interpretation into two steps:

  1. First the compiler interprets non-extension options (known options, no custom options). This is done by skipping any option where the first name element is an extension. This is necessary because features is such an option, and they must be interpreted before we can correctly interpret custom options (which may rely on features for correct serialization of those options).
  2. Then the compiler interprets the remaining extension options (aka custom options).

Unfortunately, however, because features is backed by an extendable message, the above is not adequate. It is still possible to define a source file where the features are not correctly serialized and/or incorrectly validated because they include an extension defined in the same file, whose features themselves have not yet been interpreted:

// test.proto
edition = "2023";

import "google/protobuf/descriptor.proto";

message Foo {
  string bar = 1 [
    // We don't yet know how to serialize extension custom at this point.
    features.(custom).foo = "xyz",
    // This is actually not valid: the compiler should really reject this file.
    // But we don't yet know that another.foo is a closed enum and thus
    // don't yet know that -321 is not valid.
    features.(another) = {foo: -321}
  ];
}

message Custom {
  // Here's a tricky case where even complex DAG ordering of interpreting
  // options wouldn't necessarily solve the issue because this element
  // indirectly depends on itself: Custom -> custom -> Custom.
  string foo = 1 [features = { [custom]: {foo: "abc"} }];
}
message Another {
  Enum foo = 1;
}

enum Enum {
  option features.enum_type = CLOSED;
  ZERO = 0;
  ONE = 1;
}

extend google.protobuf.FeatureSet {
  Custom custom = 1000 [features.message_encoding=DELIMITED];
  Another another = 1001;
}

The result of the above is a descriptor set where the Foo.bar field has unrecognized fields, because the (custom) extension is not serialized correctly. It is serialized using normal (length-prefixed) message encoding, though the field has a feature that suggests it should instead be delimited (aka group encoding).

$ protoc test.proto -o /dev/stdout | protoc test.proto --decode google.protobuf.FileDescriptorSet
file {
  name: "test.proto"
  dependency: "google/protobuf/descriptor.proto"
  message_type {
    name: "Foo"
    field {
      name: "bar"
      number: 1
      label: LABEL_OPTIONAL
      type: TYPE_STRING
      options {
        features {
          [another] {
            foo: ONE
          }
          1000 {
            1: "xyz"
          }
        }
      }
      json_name: "bar"
    }
  }
  message_type {
    name: "Custom"
    field {
      name: "foo"
      number: 1
      label: LABEL_OPTIONAL
      type: TYPE_STRING
      json_name: "foo"
    }
  }
  message_type {
    name: "Another"
    field {
      name: "foo"
      number: 1
      label: LABEL_OPTIONAL
      type: TYPE_ENUM
      type_name: ".Enum"
      json_name: "foo"
    }
  }
  enum_type {
    name: "Enum"
    value {
      name: "ZERO"
      number: 0
    }
    value {
      name: "ONE"
      number: 1
    }
    options {
      features {
        enum_type: CLOSED
      }
    }
  }
  extension {
    name: "custom"
    extendee: ".google.protobuf.FeatureSet"
    number: 1000
    label: LABEL_OPTIONAL
    type: TYPE_MESSAGE
    type_name: ".Custom"
    options {
      features {
        message_encoding: DELIMITED
      }
    }
    json_name: "custom"
  }
  extension {
    name: "another"
    extendee: ".google.protobuf.FeatureSet"
    number: 1001
    label: LABEL_OPTIONAL
    type: TYPE_MESSAGE
    type_name: ".Another"
    json_name: "another"
  }
  syntax: "editions"
  edition: EDITION_2023
}

As seen in the above output, the (custom) feature message is not recognized because it uses incorrect encoding. Here's the relevant portion:

          1000 {
            1: "xyz"
          }

Also, a bit surprisingly, the [another].foo field has the wrong value:

          [another] {
            foo: ONE
          }

Per closed-enum semantics, I would have expected foo to remain unset and there to be an unrecognized value with -321. Probing a little further, using --decode_raw, we see that the value was serialized as 1 (!!):

1: <
  1: "test.proto"
  3: "google/protobuf/descriptor.proto"
  4: <
    1: "Foo"
    2: <
      1: "bar"
      3: 1
      4: 1
      5: 9
      8: <
        21: <
          1000: <
            1: "xyz"
          >
          1001: <
            1: 1
          >
        >
      >
      10: "bar"
    >
  >
  ... /* rest of output truncated since it's not relevant */ ...
>

Super-strange that it serialized the value 1 instead of -321 (which --decode_raw would show as 18446744073709551295). (I'll file a separate bug for some pretty strange things I'm seeing in this area.)

I don't think there's any trivial way to remedy this. But I think it probably could be handled like so:

  1. Update the first "non-extension" pass of interpreting options to ignore options where any name component is an extension (instead of only looking at the first name component). This would effectively defer interpreting those options in Foo until after the extensions and enums' features are interpreted.
  2. Detect the case where a message literal is used (like in Custom.foo) and complain. By requiring it to be de-structured into multiple option statements, we can successfully process all core features/options first, and then interpret the self-reference after the extension and message's features have been processed.

The main case where this would still not work is if the non-extension options/features had their own messy references or even cycles. But that could only happen if descriptor.proto were migrated to Editions and it included new options/features that could have this sort of cycle (where they depend on themselves to describe their own features). This is highly unlikely to ever be an issue (though it would be nice if the compiler could at least detect the case, just in case, and complain).

@mkruskal-google
Copy link
Member

I think this also suffers the issue I pointed out in #16757, where overriding extension 1000 causes some interesting issues. Is there a unique problem here if that's changed to anything else?

I suspect there might still be an underlying bug when you use custom features in the same file they're defined in. We've been assuming that's a no-go situation and feature definitions can only change their own values on edition bumps. We could probably lock that down with some extra validation, although the most straight-forward place to put it would be during defaults compilation (which wouldn't stop your example from building)

copybara-service bot pushed a commit that referenced this issue May 7, 2024
This will prevent users from accidentally overriding these with different types (e.g. #16757 and #16756).

PiperOrigin-RevId: 631518986
@jhump
Copy link
Contributor Author

jhump commented May 7, 2024

I suspect there might still be an underlying bug when you use custom features in the same file they're defined in.

Yes, this is the crux of the issue. When interpreting options within a file, the current approach, to partition options into non-extension options and extension options, isn't sufficient.

Banning the use of a feature from the file in which it is defined would indeed work, but would be problematic if/when descriptor.proto itself is ever moved to editions. It could never can be moved to editions with such a rule in place, since it would be unable to ever override any feature values (like even to preserve semantics if moving to a newer edition) since all non-extension features are defined in that file.

Is there a unique problem here if that's changed to anything else?

If it's changed to something else then there is still the issue, similar to the incorrect message encoding, that it would treat the enum as open since it hasn't yet learned that it is closed. So it ends up allowing the -321 unrecognized numeric value for a closed enum, similar to how it serializes a delimited message using length-prefixed encoding.

@mkruskal-google
Copy link
Member

Yea migrating descriptor.proto to editions has a lot of potential problems (e.g. it can't use language features so it will always be a behavior change), we could always loosen this requirement later if we decide we want to.

copybara-service bot pushed a commit that referenced this issue May 9, 2024
This will prevent users from accidentally overriding these with different types (e.g. #16757 and #16756).

PiperOrigin-RevId: 631518986
copybara-service bot pushed a commit that referenced this issue May 9, 2024
This will prevent users from accidentally overriding these with different types (e.g. #16757 and #16756).

PiperOrigin-RevId: 631518986
copybara-service bot pushed a commit that referenced this issue May 9, 2024
This will prevent users from accidentally overriding these with different types (e.g. #16757 and #16756).

PiperOrigin-RevId: 631518986
mkruskal-google added a commit to mkruskal-google/protobuf that referenced this issue May 13, 2024
This will prevent users from accidentally overriding these with different types (e.g. protocolbuffers#16757 and protocolbuffers#16756).

PiperOrigin-RevId: 631518986
copybara-service bot pushed a commit that referenced this issue May 15, 2024
This will prevent users from accidentally overriding these with different types (e.g. #16757 and #16756).

PiperOrigin-RevId: 633760581
copybara-service bot pushed a commit that referenced this issue May 16, 2024
This is an edge case we can't handle properly today.  Rather than allowing poorly defined behavior, we'll make this an error condition until we can actually support it.  In the future, it may be necessary to upgrade feature files to newer editions.

Closes #16756

PiperOrigin-RevId: 634122584
copybara-service bot pushed a commit that referenced this issue May 16, 2024
This is an edge case we can't handle properly today.  Rather than allowing poorly defined behavior, we'll make this an error condition until we can actually support it.  In the future, it may be necessary to upgrade feature files to newer editions.

Closes #16756

PiperOrigin-RevId: 634122584
mkruskal-google added a commit that referenced this issue May 16, 2024
This is an edge case we can't handle properly today.  Rather than allowing poorly defined behavior, we'll make this an error condition until we can actually support it.  In the future, it may be necessary to upgrade feature files to newer editions.

Closes #16756

PiperOrigin-RevId: 634122584
copybara-service bot pushed a commit that referenced this issue May 16, 2024
This is an edge case we can't handle properly today.  Rather than allowing poorly defined behavior, we'll make this an error condition until we can actually support it.  In the future, it may be necessary to upgrade feature files to newer editions.

Closes #16756

PiperOrigin-RevId: 634122584
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment