LwM2M string resource is not a zero terminated C-string according to the LwM2M specification #90719

GardeningStevie · 2025-05-28T08:14:48Z

I ran into an issue with the current implementation of the LwM2M String resource.

A string resource may be truncated by the zero terminator when the value uses up the entire buffer. If the final symbol is a UTF-8 multi-byte character, the result is an invalid UTF-8 string.

From lwm2m_engine_set(...):

	case LWM2M_RES_TYPE_STRING:
		if (len) {
			strncpy(data_ptr, value, len - 1);
			((char *)data_ptr)[len - 1] = '\0';
		} else {
			((char *)data_ptr)[0] = '\0';
		}
		break;

From the git history, the zero termination was always done in one way or another.

However, that is not in line with the LwM2M specification, which specifies a string as UTF-8. In UTF-8, NUL is a character like any other.

Many people who are familiar with the C programming language think that zero-terminated strings are common sense. In reality, they are a concept of the C programming language.

So, the current implementation, which enforces zero termination, may cause compatibility issues with other LwM2M implementations. At the very least, any unit test that does not read back exactly what was set should fail. That is the case with the current implementation.

My solution proposal
Since the datalen field was introduced, a String resource can be handled like Opaque. So, don't add or remove any character in the resource itself.

lwm2m_get_string() and lwm2m_set_string() can do the C string <-> LwM2M string conversion by adding and removing the C string zero terminator.

Actually, the lwm2m_get_string() and lwm2m_get_opaque() functions also lack information about the actual resource length. So, I added a parameter that returns the resource size.
This breaks the API compatibility of these functions. Anyway, lwm2m_get_opaque() had no useful purpose without that information. lwm2m_get_string() now adds the C zero termination but requires a buffer that can take that additional character.

The first commit is too big. I cleaned up some code that hurt my eyes, which was not really necessary.

However, any thoughts about the basic topic?

It's against the LwM2M specification, which does not specify zero-termination for a string resource. It's not required, and the current implementation makes a terminating UTF-8 multibyte character invalid and modifies, without any notice, any string which uses up the entire assigned buffer. It may also break compatibility with other LwM2M implementations.

This commit fixes the `lwm2m_get_string()` and `lwm2m_get_opaque()` implementations, which could not return the copied data length. So, the API of these functions breaks compatibility. However, `lwm2m_get_opaque()` had no useful purpose without the information about the copied size. `lwm2m_get_string()` still adds zero-termination but requires a buffer that can take the zero-termination character.

To fix the implementation, `lwm2m_engine_get()` was modified so that it returns the copied data length. So, a return code >= 0 is okay now.

While editing `lwm2m_engine_get()`, some "side quests" arose:
- `memcpy` works for everything except a LwM2M Time resource. So, a lot of code was dropped.
- reading from a null pointer can return `success`.
- `read_cb()` unittest implementation returns an error (null pointer).

Signed-off-by: Stefan Schwendeler <Stefan.Schwendeler@husqvarnagroup.com>

A read callback can return a null pointer to indicate an error. `lwm2m_engine_get()` will then return -ENOENT as response.

Signed-off-by: Stefan Schwendeler <Stefan.Schwendeler@husqvarnagroup.com>

Verifies that the memory copy works as expected
- return codes are correct
- resource length parameter works

Signed-off-by: Stefan Schwendeler <Stefan.Schwendeler@husqvarnagroup.com>

sonarqubecloud · 2025-06-02T11:34:41Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
1.3% Duplication on New Code

See analysis details on SonarQube Cloud

rlubos

Looks reasonable I think. Thanks for adding extra tests!

rlubos · 2025-07-18T10:22:48Z

subsys/net/lib/lwm2m/lwm2m_registry.c

 		data_ptr = res->read_cb(obj_inst->obj_inst_id, res->res_id, res_inst->res_inst_id,
 					&data_len);
 	}

-	if (data_ptr && data_len > 0) {
+	if (data_ptr == NULL) {


While the change makes sense, IMHO it'd be better to put it in a separate commit for clarity; the fix isn't really related to the string representation cleanup.

rlubos · 2025-07-18T10:27:23Z

include/zephyr/net/lwm2m.h

+int lwm2m_get_opaque(const struct lwm2m_obj_path *path, void *buf, uint16_t buflen,
+		     uint16_t *reslen);


Kind of weird that we had a function like this; how in the world would the user know how large the actual data content in the buffer is?

But anyway, it's still an API change. The LwM2M API is unstable, but the change should be covered in the migration guide for 4.3. Please add an entry; it can be in a separate commit.

SeppoTakalo · 2025-07-18T12:40:32Z

I fail to see how you can produce truncated strings in the LwM2M engine.
You should get -ENOMEM if the buffer is too small.

static int lwm2m_check_buf_sizes(uint8_t data_type, uint16_t resource_length, uint16_t buf_length)
{
	switch (data_type) {
	case LWM2M_RES_TYPE_OPAQUE:
	case LWM2M_RES_TYPE_STRING:
		if (resource_length > buf_length) {
			return -ENOMEM;
		}
		break;

SeppoTakalo · 2025-07-18T12:55:27Z

By the way, referring to the LwM2M specification does not apply in this case.

LwM2M specifies the network protocol, nothing about the internal representation in the engine or the C API of the client.

So if something is not encoded correctly according to LwM2M spec, the content-type encoder/decoder needs to be fixed. We should not touch the API because of that.

GardeningStevie force-pushed the gardena/sc/upstream/lwm2m-invalid-utf8-string-due-to-forced-zero-termination branch 4 times, most recently from acc6a5c to f27a77d Compare June 2, 2025 09:53

GardeningStevie added 3 commits June 2, 2025 13:15

tests: net: lib: lwm2m: adds read callback with error response

005b49d


tests: net: lib: lwm2m: adds test case for lwm2m_set/get_opaque

88c70cf


GardeningStevie force-pushed the gardena/sc/upstream/lwm2m-invalid-utf8-string-due-to-forced-zero-termination branch from f27a77d to 88c70cf Compare June 2, 2025 11:16

GardeningStevie marked this pull request as ready for review July 18, 2025 09:42

zephyrbot added area: LWM2M area: Networking labels Jul 18, 2025

zephyrbot requested review from jukkar, pdgendt, rlubos, SeppoTakalo and ssharks July 18, 2025 09:43

zephyrbot assigned rlubos and jukkar Jul 18, 2025

rlubos reviewed Jul 18, 2025
