Skip to content

Conversation

@martin-gpy
Copy link
Contributor

During nvme connect-all, if a discovery log page record reports the sectype as anything other than NVMF_TCP_SECTYPE_NONE in nvmf_connect_disc_entry(), it then assumes that tls should be default set for the same. But this holds true only for configured PSK TLS connections alone and not generated PSK TLS (i.e. secure channel concat) connections since both concat and tls flags are meant to be mutually exclusive. Fix the same.

Fixes: #3034

During nvme connect-all, if a discovery log page record reports
the sectype as anything other than NVMF_TCP_SECTYPE_NONE in
nvmf_connect_disc_entry(), it then assumes that tls should be
default set for the same. But this holds true only for configured
PSK TLS connections alone and not generated PSK TLS (i.e. secure
channel concat) connections since both concat and tls flags are
meant to be mutually exclusive. Fix the same.

Signed-off-by: Martin George <marting@netapp.com>
@igaw
Copy link
Collaborator

igaw commented Jan 5, 2026

I think the nvmf_connect_disc_entry function should not blindly enable tls in this case.

Something like this here:

	if (e->trtype == NVMF_TRTYPE_TCP &&
	    e->tsas.tcp.sectype != NVMF_TCP_SECTYPE_NONE &&
	    !cfg->tls)
		c->cfg.tls = true;

If we check the command line for --concat and --tls and ensure that the user doesn't provide both options, then this here would be a good place to log what is happening. It seems relevant to report to me. What do you think?

@martin-gpy
Copy link
Contributor Author

martin-gpy commented Jan 5, 2026

I think the nvmf_connect_disc_entry function should not blindly enable tls in this case.

Yes, I agree. This code snippet at nvmf_connect_disc_entry() is really unwarranted IMO. But anyways...

Something like this here:

	if (e->trtype == NVMF_TRTYPE_TCP &&
	    e->tsas.tcp.sectype != NVMF_TCP_SECTYPE_NONE &&
	    !cfg->tls)
		c->cfg.tls = true;

If we check the command line for --concat and --tls and ensure that the user doesn't provide both options, then this here would be a good place to log what is happening. It seems relevant to report to me. What do you think?

This won't help because c->cfg.tls (or even c->cfg.concat for that matter) is always false at this point while parsing the discovery log page record at nvmf_connect_disc_entry(). IIRC, the user provided options for these flags are updated only in nvmf_add_ctrl() during the merge_config().

That's the reason I made this fix in nvmf_add_ctrl() itself in the above commit, and not in nvmf_connect_disc_entry().

@martin-gpy
Copy link
Contributor Author

Or should we just do away with the current code snippet of enabling tls if sectype is != NVMF_TCP_SECTYPE_NONE in nvmf_connect_disc_entry(), so that we completely rely on the user provided options itself for the same? That would be simple and straightforward. And then like you said, throw an additional error if tls and concat are invoked together.

What do you say?

@igaw
Copy link
Collaborator

igaw commented Jan 7, 2026

There was already some changes on this condition:

  • 3962a45 ("fabrics: add fabrics config option 'tls'")
  • 1f5db47 ("fabrics: use SECTYPE to determine whether to use TLS"

After reading up on the SECTYPE in the spec, this condition seems to implement the Host Configuration setting for TLS Required. So I don't think we can just drop it. Need to read more how the concat stuff is supposed to work together with TLS as next. Tricky stuff.

BTW, I think we should add a blktests for this scenario.

@hreinecke @calebsander

@martin-gpy
Copy link
Contributor Author

martin-gpy commented Jan 7, 2026

After reading up on the SECTYPE in the spec, this condition seems to implement the Host Configuration setting for TLS Required. So I don't think we can just drop it. Need to read more how the concat stuff is supposed to work together with TLS as next. Tricky stuff.

Then why not go with the current fix itself in the above commit at d1377a7? That way, we retain the existing check of default enabling --tls whenever sectype is != NVMF_TCP_SECTYPE_NONE in the discovery log page record, but at the same time ensuring --tls is turned off whenever --concat is invoked. That's also simple and straightforward and avoids all this confusion in the first place. After all, both --tls and --concat flags are used for creating TLS connections itself, but it's just that they correspond to mutually exclusive TLS features (--tls is used for configured PSK TLS whereas --concat is used for generated PSK TLS) with separate code paths.

@martin-gpy
Copy link
Contributor Author

I was digging more into this, and looks like there is a simple solution for this issue.

For configured PSK, the treq field in the discovery log page record is set to required whereas it is set to not required for generated PSK. So we can use that field to distinguish between the two here and set the appropriate flags for the same. Sample patch as follows:

        if (e->trtype == NVMF_TRTYPE_TCP &&
-           e->tsas.tcp.sectype != NVMF_TCP_SECTYPE_NONE)
-               c->cfg.tls = true;
+           e->tsas.tcp.sectype != NVMF_TCP_SECTYPE_NONE) {
+               if (e->treq == NVMF_TREQ_REQUIRED)
+                       c->cfg.tls = true;
+               else if (e->treq == NVMF_TREQ_NOT_REQUIRED)
+                       c->cfg.concat = true;
+       }

During nvme connect-all, if a discovery log page record reports the
sectype as anything other than NVMF_TCP_SECTYPE_NONE in
nvmf_connect_disc_entry(), it then assumes that --tls should be default
set for the same. But this holds true only for configured PSK TLS alone
and not for generated PSK TLS. For generated PSK TLS connections using
--concat (i.e. secure channel concat), this would lead to connection
failures since both --tls and --concat are not to be invoked together.

Fix this by distinguishing the two through their respective treq values
and setting the appropriate --tls or --concat flags for each.

Signed-off-by: Martin George <marting@netapp.com>
@martin-gpy
Copy link
Contributor Author

@hreinecke - Could you please share your comments on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

concat connections fail during connect-all

2 participants