python - How to preserve quotes in YAML when processing annotations data using PyYAML - Stack Overflow


I'm working with a YAML file that contains annotations with specific values that need to be quoted. Here's an example of my input:

example.com/:
 1. catalog-item-20='server1.test.local:80'
 2. network-policy-version=v14.yaml
openshift.io/:
 3. sa.scc.mcs='s0,c107,c49'
collectord.io/:
 4. logs-index=channel_1
 5. logs-override.11-match='^.*(%SENSITIVE%).*$' 

I need to process this YAML and output it with proper quoting for values containing special characters. The desired output should look like this:

annotations:
    example.com/catalog-item-20: 'server1.test.local:80'
    bnhp.co.il/network-policy-version: v14.yaml
    openshift.io/sa.scc.mcs: 's0,c107,c49'
    collectord.io/logs-index: channel_1
    collectord.io/logs-override.11-match: ^.*(%SENSITIVE%).*$

Notice that:

  1. Values that come with quotes in the input should remain quoted in the output
  2. Values without quotes should remain unquoted
  3. Quote preservation should be independent of the value's content (special characters, commas, colons, etc.)

Here's my current code:

import yaml

with open('annotations.yaml', 'r') as f:
    raw_annotations = yaml.safe_load(f)
    
annotations = {}
for annotations_prefix, annotations_body in raw_annotations.items():
    prefix = annotations_prefix if annotations_prefix.endswith('/') else f"{annotations_prefix}/"
    for value in annotations_body:
        if '=' in value:
            annotation_key, annotation_value = value.split('=', 1)
            if annotation_value.startswith("'") and annotation_value.endswith("'"):
                annotation_value = annotation_value[1:-1]
            full_key = f"{prefix}{annotation_key}"
            annotations[full_key] = annotation_value

namespace_content = {
    'apiVersion': 'v1',
    'kind': 'Namespace',
    'metadata': {
        'annotations': annotations
    }
}

with open('namespace.yaml', 'w') as f:
    yaml.dump(namespace_content, f, default_flow_style=False)

But this produces output without proper quotes:

annotations:
    example.com/catalog-item-20: server1.test.local:80
    openshift.io/sa.scc.mcs: s0,c107,c49

I've tried:

  1. Using default_style="'" but this quotes everything
  2. Using ruamel.yaml with preserve_quotes=True but it didn't help
  3. Using yamlcore package but it also didn't preserve quotes

How can I make PyYAML preserve the quotes exactly as they appear in the input file?

Asked Feb 4 at 15:26 by user2339149

4 Comments
  • In general you can't, by design, when the YAML specification doesn't require those quotes: they're syntax, but where discardable they're syntax without any semantically meaningful value, so any parser that cares if they're discarded or not is by definition buggy. Don't use buggy tools to read your YAML documents (that goes double for tools like sed or awk that don't know anything about the YAML spec at all). If you want a compatible language with less pollution by TIMTOWTDI, consider JSON instead ("compatible" in the sense that all valid JSON is also valid YAML). – Charles Duffy Commented Feb 4 at 15:48
  • (To put it differently, YAML explicitly makes quotes optional when there's no ambiguity created by their absence, so there's nothing improper about leaving them out; the reference to "proper" quotes isn't really something that makes sense in this context) – Charles Duffy Commented Feb 4 at 15:52
  • I don't think any of the values you show in annotations: require quotes one way or another, and to @CharlesDuffy's point they won't be meaningful to Kubernetes. A well-maintained YAML library should do the right thing when it's serializing data structures. – David Maze Commented Feb 4 at 16:14
  • @CharlesDuffy it's true that quotes are not meant to be preserved, same as mapping order or comments. But a library that optionally supports preserving them is not "buggy by definition". There can be use cases where it's helpful not to lose this kind of information. – tinita Commented Feb 9 at 15:23

1 Answer

Your input is valid YAML, but there is no way the output you present comes from the input and the program you specify:

  • there is no apiVersion key at the root level of your expected output (kind and metadata are missing as well)
  • nothing in your code removes the numbering 1. to 5.
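  • you never split annotations_body into separate entries: it is a single string value (a multi-line plain scalar, whose line breaks the loader folds into spaces), so the for loop iterates over its characters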
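
As a rough illustration of the last two points, the numbered key='value' entries could be pulled out of each loaded string along the following lines. This is a stdlib-only sketch, not code from the question or the answer: parse_entries and ENTRY_SPLIT are hypothetical names, and it assumes every mapping value arrives as one string in which the entries are separated by whitespace (spaces or line breaks) and prefixed by "1. ", "2. " and so on.

```python
import re

# Matches the "1. ", "2. " numbering at the start of the string or after whitespace.
ENTRY_SPLIT = re.compile(r"(?:^|\s+)\d+\.\s+")


def parse_entries(body):
    """Return (key, value, was_quoted) tuples for each numbered key=value entry."""
    entries = []
    for chunk in ENTRY_SPLIT.split(body.strip()):
        if "=" not in chunk:
            continue  # skips the empty chunk in front of the first number
        key, value = chunk.split("=", 1)
        value = value.strip()
        was_quoted = len(value) >= 2 and value[0] == "'" and value[-1] == "'"
        if was_quoted:
            value = value[1:-1]  # drop the quotes, but remember they were there
        entries.append((key.strip(), value, was_quoted))
    return entries


body = "1. catalog-item-20='server1.test.local:80' 2. network-policy-version=v14.yaml"
for entry in parse_entries(body):
    print(entry)
```

The was_quoted flag is the piece the question's code throws away; keeping it lets a later step decide, per value, whether to wrap the string in a type that is dumped with quotes.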

Correcting the whole program is a bit too much here, but in general, setting .preserve_quotes in ruamel.yaml only affects strings that were loaded from YAML, not newly created Python strings. To get single quotes on new values, you have to create them as instances of the special ruamel.yaml string subclasses:

import sys
import ruamel.yaml
from pathlib import Path


def SQ(s):
    return ruamel.yaml.scalarstring.SingleQuotedScalarString(s)


data = {'annotations': {
    'example.com/catalog-item-20': SQ('server1.test.local:80'),
    'bnhp.co.il/network-policy-version': 'v14.yaml',
    'openshift.io/sa.scc.mcs': SQ('s0,c107,c49'),
    'collectord.io/logs-index': 'channel_1',
    'collectord.io/logs-override.11-match': '^.*(%SENSITIVE%).*$',
    }}


output = Path('namespace.yaml')

yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4)

yaml.dump(data, output)
sys.stdout.write(output.read_text())

which gives:

annotations:
    example.com/catalog-item-20: 'server1.test.local:80'
    bnhp.co.il/network-policy-version: v14.yaml
    openshift.io/sa.scc.mcs: 's0,c107,c49'
    collectord.io/logs-index: channel_1
    collectord.io/logs-override.11-match: ^.*(%SENSITIVE%).*$

But you only have to do that if you process the output with a broken YAML parser (or some non-YAML tool), as these quotes are superfluous.
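
Since the question asks about PyYAML specifically, the same idea (marking individual strings for single quotes) can probably be sketched there with a str subclass and a custom representer. This is not part of the answer above: SingleQuoted and represent_single_quoted are hypothetical names, the annotation values are copied from the example data, and sort_keys=False assumes PyYAML 5.1 or later.

```python
import yaml


class SingleQuoted(str):
    """Marker type (hypothetical name): strings to be dumped in single quotes."""


def represent_single_quoted(dumper, data):
    # Emit an ordinary YAML string, but force the single-quoted style.
    return dumper.represent_scalar("tag:yaml.org,2002:str", str(data), style="'")


# Register on SafeDumper so that yaml.safe_dump honours the marker type.
yaml.add_representer(SingleQuoted, represent_single_quoted, Dumper=yaml.SafeDumper)

annotations = {
    "example.com/catalog-item-20": SingleQuoted("server1.test.local:80"),
    "bnhp.co.il/network-policy-version": "v14.yaml",
    "openshift.io/sa.scc.mcs": SingleQuoted("s0,c107,c49"),
    "collectord.io/logs-index": "channel_1",
}

text = yaml.safe_dump(
    {"annotations": annotations},
    default_flow_style=False,
    sort_keys=False,  # keep insertion order instead of sorting keys
    indent=4,
)
print(text)

# Loading the result back gives plain str values: the quotes carry no data.
reloaded = yaml.safe_load(text)
print(reloaded["annotations"]["openshift.io/sa.scc.mcs"])
```

Only the values wrapped in SingleQuoted are quoted in the dump; everything else is left to PyYAML's default choice of style. The reload at the end illustrates the point made above: a conforming parser returns the same string with or without the quotes.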
