@jojochuang
Contributor

What changes were proposed in this pull request?

HDDS-13579. [Docs] Explain how Ratis write pipelines are calculated

Please describe your PR in detail:

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13579

How was this patch tested?

Doc only.

@szetszwo
Contributor

szetszwo commented Jan 3, 2026

@jojochuang, thanks for working on this!

Could you limit the line length to 120 characters, as in the code? That makes it easier to comment on.

Discovered and corrected the documentation for how the number of
EC pipelines is calculated. The previous analysis was incorrect.

- `ErasureCoding.md` is updated to describe the two new properties
  `ozone.scm.ec.pipeline.minimum` and
  `ozone.scm.ec.pipeline.per.volume.factor` and the `max()` logic
  used to determine the target number of pipelines.

- `ProductionDeployment.md` is updated to reference the correct
  and existing configuration property for tuning EC pipelines.

Change-Id: I393dc60d8745da2b2bb7899530665a108956446d
Change-Id: I0ec667c21155436eb6a0654782b43b48636f75d5
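
For readers of this thread, the `max()` logic mentioned in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions: the thread does not spell out the formula, so the sketch assumes the volume-based candidate is the per-volume factor multiplied by a volume count. The function name and parameters are hypothetical; the updated `ErasureCoding.md` is the place to check what the factor is actually applied to.

```python
def target_ec_pipelines(minimum: int, per_volume_factor: float, volume_count: int) -> int:
    """Hypothetical sketch of the target EC pipeline count.

    minimum           -- stands in for ozone.scm.ec.pipeline.minimum
    per_volume_factor -- stands in for ozone.scm.ec.pipeline.per.volume.factor
    volume_count      -- ASSUMPTION: number of volumes the factor is applied to
    """
    # Assumed volume-based candidate: factor times volume count, truncated to an int.
    volume_based = int(per_volume_factor * volume_count)
    # The larger of the configured floor and the volume-based candidate wins.
    return max(minimum, volume_based)


if __name__ == "__main__":
    # Illustrative values only, not Ozone defaults.
    print(target_ec_pipelines(minimum=5, per_volume_factor=1, volume_count=3))
    print(target_ec_pipelines(minimum=5, per_volume_factor=1, volume_count=12))
```

Under this assumption the minimum acts as a floor for small clusters, while larger clusters scale with their volume count.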
@jojochuang
Contributor Author

Thanks for the review. Updated.

@adoroszlai adoroszlai added the documentation Improvements or additions to documentation label Jan 5, 2026
Contributor

@ashishkumar50 ashishkumar50 left a comment

@jojochuang Thanks for writing this up.

recommended, as it allows the cluster to scale pipeline capacity naturally with its resources. You can use the
global limit (`ozone.scm.ratis.pipeline.limit`) as a safety cap if needed. A good starting value for
`ozone.scm.pipeline.per.metadata.disk` is **2**. Monitor the `NumOpenPipelines` metric in SCM to see if the
actual number of pipelines aligns with your configured targets.
Contributor

I don't think a `NumOpenPipelines` metric exists. We can either use the admin command to see the number of open pipelines, or use Recon to see them.

Contributor Author

To clarify, the SCM web UI has a Pipeline Statistics section, which pulls its metrics from:

{
  "name": "Hadoop:service=SCMPipelineManager,name=SCMPipelineManagerInfo",
  "modelerType": "org.apache.hadoop.hdds.scm.pipeline.PipelineManagerImpl",
  "PipelineInfo": [
    {
      "key": "CLOSED",
      "value": 0
    },
    {
      "key": "ALLOCATED",
      "value": 0
    },
    {
      "key": "OPEN",
      "value": 1
    },
    {
      "key": "DORMANT",
      "value": 0
    }
  ]
},
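
If it helps, here is a minimal sketch of pulling the open-pipeline count out of a bean shaped like the one above. The sample payload is copied from this comment, with the trailing comma dropped so it parses on its own. The helper names `pipeline_counts` and `open_pipelines` are hypothetical, and how the payload is fetched from SCM is deliberately left out.

```python
import json

# Sample bean copied from the comment above (trailing comma removed).
SAMPLE_BEAN = """
{
  "name": "Hadoop:service=SCMPipelineManager,name=SCMPipelineManagerInfo",
  "modelerType": "org.apache.hadoop.hdds.scm.pipeline.PipelineManagerImpl",
  "PipelineInfo": [
    {"key": "CLOSED", "value": 0},
    {"key": "ALLOCATED", "value": 0},
    {"key": "OPEN", "value": 1},
    {"key": "DORMANT", "value": 0}
  ]
}
"""


def pipeline_counts(bean: dict) -> dict:
    """Flatten the PipelineInfo key/value list into a {state: count} mapping."""
    return {entry["key"]: entry["value"] for entry in bean.get("PipelineInfo", [])}


def open_pipelines(bean: dict) -> int:
    """Return the OPEN pipeline count, or 0 if that state is not reported."""
    return pipeline_counts(bean).get("OPEN", 0)


if __name__ == "__main__":
    bean = json.loads(SAMPLE_BEAN)
    print(pipeline_counts(bean))
    print(open_pipelines(bean))
```

The `OPEN` entry is the value one would compare against the configured pipeline targets.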

Change-Id: I2537905761cf45d23cdb3701b2f0c94e7ff2485a