* Initial stub
* markdown
* code style
* Memory management
* fix invalid link
* Info on Criterion
* tests, stub on random
* wording
* Assertions
* preloaded
* random utilities
* example test suite
* example test suite
* todo: not on size of returned array
* note on input mutation
* initial version, checklist
* Naive approach
* fix markdown
* Organization
* Preallocated buffer
* Query + allocation + calculation
* Solution returns status
* Symmetric user functions, organization
* 2D arrays: flat
* Hide one of paragraphs because it's probably too complex.
* 2D arrays
* typos, wording
* Apply suggestions from code review
Co-authored-by: Greg Gorlen <gsgorlen@gmail.com>
* Apply suggestions from code review
Co-authored-by: Donald Sebastian Leung <donaldsebleung@gmail.com>
* Apply suggestions from code review
Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com>
* Organization
* Organization
* constants
* Memory allocated by tests
* Memory managed by the user
* 2d arrays
* Note on casts of linear buffers
* Note on cast between 1d and 2d
* Array of constants
* Headers of examples
* Apply suggestions from code review
Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com>
* Fix example section
* Apply suggestions from code review
Co-authored-by: Donald Sebastian Leung <donaldsebleung@gmail.com>
* Apply suggestions from code review
* Remove string constants, replace them with enums
* Apply suggestions from code review
Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Donald Sebastian Leung <donaldsebleung@gmail.com>
Co-authored-by: Greg Gorlen <gsgorlen@gmail.com>
Co-authored-by: Donald Sebastian Leung <donaldsebleung@gmail.com>
Co-authored-by: Steffan <40404519+Steffan153@users.noreply.github.com>
Unlike many modern, high-level languages, C does not manage memory automatically. Manual memory management is a vast and complex topic: depending on the specific case, there are many possible ways of achieving the goal, each with its own caveats and pitfalls.

## General information

### Specification

Whenever a kata passes a pointer to the user's solution, or requires the solution to return or manipulate a pointer or the data it references, the kata should **explicitly** and **clearly** provide all information necessary to carry out the operation correctly. See the paragraph on [related guidelines](/languages/c/authoring/#working-with-pointers-and-memory-management) in the ["C: creating and translating a kata"](/languages/c/authoring/) tutorial.
When the structure, layout, or allocation scheme of the pointed-to data is not described, users cannot know how to implement the requirements without causing either a crash or a memory leak.

### Interface

It often happens that the solution function has to accept and return more values than just those related to the kata task itself. Additional parameters may be required for tracking memory, sizes of allocated buffers, statuses, etc. Depending on the exact requirements, these values can be passed in and returned as separate function arguments, or packed together into some kind of structure. Examples in this article assume the former, but authors are free to decide otherwise.
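
The two interface styles can be compared side by side. The following is only a sketch: the task (counting even numbers) and the names `count_evens`, `count_evens_packed`, `IntSpan`, and `CountResult` are hypothetical and exist purely for illustration.

```c
#include <stddef.h>

//Variant 1: the extra values travel as separate arguments.
//The array and its length come in, the count goes out through
//an output parameter, and the return value reports the status.
int count_evens(const int* numbers, size_t length, size_t* evens_count) {
  if (numbers == NULL || evens_count == NULL) return 0; //failure
  size_t count = 0;
  for (size_t i = 0; i < length; ++i)
    if (numbers[i] % 2 == 0) ++count;
  *evens_count = count;
  return 1; //success
}

//Variant 2: the same information packed into structures.
typedef struct IntSpan { const int* data; size_t length; } IntSpan;
typedef struct CountResult { int ok; size_t count; } CountResult;

CountResult count_evens_packed(IntSpan input) {
  CountResult result = { 0, 0 };
  result.ok = count_evens(input.data, input.length, &result.count);
  return result;
}
```

Separate arguments keep the prototype self-explanatory, while structures keep it short when many values have to travel together. Whichever style is chosen, the description should state it clearly.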

### Arrays and strings

Since C-strings and arrays of other types are similar from the perspective of memory management, most of the techniques presented here apply equally to memory holding integers, floats, or characters, zero-terminated or not.

## Memory Management Patterns

In C, unlike in, for example, Python, Java, C#, or JavaScript, dynamically allocated memory is not managed by the runtime. It is a resource like any other, such as a file, a database or network connection, or a hardware device. The program itself has to take care of it properly, allocating it when necessary and freeing it when it is no longer needed.

In a kata, memory can be managed by the test suite, by the user, or by both. Authors are free to choose how their kata deals with memory and can pick any ownership strategy. However, they should be aware of the advantages and disadvantages of each strategy, and of which one fits a given situation best.

### Statically allocated constant data

The best way to prevent problems with memory allocation is to avoid unnecessary memory allocation. This advice might sound tricky, but many kata require dynamic memory allocation, or operations on data referenced by pointers, when it is simply not necessary and could be avoided. A common example is a kata that requires returning a pointer to a string which could be replaced by a constant. This happens particularly often when translating kata from other languages. Returning a string in high-level languages is not a problem, but in C it always raises the questions of who should allocate it and how. Consider replacing the string with an `enum`. For example, if the requirement for the JavaScript version is: _"Return the string 'BLACK' if a black pawn will be captured first, 'WHITE' if a white one, and 'NONE' if all pawns are safe."_, the C version should preferably provide and use the named constants `BLACK`, `WHITE`, and `NONE`.

<details>
<summary>Example</summary>

Solution:

```c
//Since Codewars does not allow header files for kata, declarations need to be repeated
//This definition has to be provided by the solution stub snippet.
typedef enum Player { BLACK, WHITE, NONE } Player;

Player who_won(const char* board) //typedef used for return type
{
  if(...) {
    return BLACK; //return constant instead of an allocated string
  } else if (...) {
    return WHITE;
  } else {
    return NONE;
  }
}
```

Tests:

```c
//Since Codewars does not allow header files for kata, declarations need to be repeated
typedef enum Player { BLACK, WHITE, NONE } Player;
//...
  //remember to turn enum values into strings to get better error messages
  cr_assert_eq(winner, NONE, "Expected: [%s], but was: [%s]", stringify(NONE), stringify(winner));
}
```

</details>
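
The tests above call a `stringify` helper whose definition is not shown. A minimal sketch of how such a helper could look is given below. The implementation and the fallback text are assumptions, not the helper actually used by any kata.

```c
typedef enum Player { BLACK, WHITE, NONE } Player;

//Hypothetical helper: maps an enum value to its name.
//The returned pointers refer to string literals, which have static
//storage duration, so there is nothing to allocate or free.
const char* stringify(Player player) {
  switch (player) {
    case BLACK: return "BLACK";
    case WHITE: return "WHITE";
    case NONE:  return "NONE";
  }
  //reached only if the solution returned a value outside of the enum
  return "<invalid Player value>";
}
```

Handling out-of-range values explicitly keeps assertion messages readable even when a solution returns garbage.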

### Memory managed by tests (i.e. caller)

One set of techniques assumes that the caller owns the allocated memory, so the tests are responsible for allocating and releasing it. Memory is always allocated by the test suite, which can decide whether to allocate it automatically (i.e. on the stack), dynamically (for example with `malloc`), or in some other available way. The test suite is also responsible for releasing it, if necessary. The allocated buffer is passed to the user's solution, which fills it with the requested data.

Sometimes the size of the result is known exactly before the solution is called, or it is possible to pre-allocate a buffer that is large enough for every call. For example, if the test suite asks for `n` Fibonacci numbers, the resulting array needs to hold at least `n` elements. Sometimes the exact size is not known, but its upper bound can be accurately estimated. For example, a function that removes punctuation from a string needs a buffer at least as large as the input string, but the result can turn out to be shorter. In such cases, the test suite can allocate a buffer big enough to hold the result and pass it to the solution function:

<details>
<summary>Example</summary>

Solution:

```c
//function prototype can use size hints
void calculate_numbers(size_t n, int result[n]) {
  //...actual calculations
}
```

Tests:

```c
void calculate_numbers(size_t n, int result[n]);

Test(fixed_tests, small_inputs) {

  //requested amount of numbers
  const int to_generate = 4;

  //array allocated on stack,
  //the required size is perfectly known
  int result_array[to_generate];

  //pass the array to the function, and expect
  //it to be filled with the result
  calculate_numbers(to_generate, result_array);

  //...perform assertions, verify correctness of returned numbers...

  //no need to deallocate the array
}

Test(random_tests, large_inputs) {

  const int MAX_TEST = 10000000;

  //dynamically allocate an array large enough to fit all possible answers.
  //allocate it once, and reuse it through the tests.
  int* array = malloc(sizeof(int) * MAX_TEST);

  //ten random tests
  for(int i=0; i<10; ++i) {

    //randomize the input
    int n = rand() % MAX_TEST + 1;

    //use preallocated array
    calculate_numbers(n, array);

    //...perform assertions, verify correctness of returned numbers...
  }

  //release the memory after all tests
  free(array);
}
```

</details>
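
The case where only an upper bound is known can be sketched in a similar way, using the punctuation example mentioned earlier. The function name `remove_punctuation` and its exact contract are hypothetical and chosen only for illustration.

```c
#include <ctype.h>
#include <stddef.h>

//Copies `input` into `output`, skipping punctuation characters.
//`output` is provided by the caller and must have room for at least
//strlen(input) + 1 characters, which is the upper bound of the result.
//Returns the length of the result, which can be shorter than the input.
size_t remove_punctuation(const char* input, char* output) {
  size_t written = 0;
  for (size_t i = 0; input[i] != '\0'; ++i) {
    //the cast avoids undefined behavior for negative char values
    if (!ispunct((unsigned char)input[i]))
      output[written++] = input[i];
  }
  output[written] = '\0';
  return written;
}
```

A test would allocate a buffer of `strlen(input) + 1` characters, on the stack or with `malloc`, pass it to the solution, and release it afterwards if necessary. The solution itself never allocates anything.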

This technique is often overlooked by kata authors, but it greatly simplifies both the user's solution and the way it communicates with the test suite. The solution does not have to worry about allocations or allocation errors and can focus on its task. The test suite can use any allocation technique it wants, such as automatic allocation on the stack or dynamic allocation on the heap. Buffers can be allocated once and reused across many test calls.

The biggest limitation of memory allocated by the caller is that its size has to be known, or possible to estimate, before calling the user's solution. This is very often the case, but sometimes such an estimate is not possible or not easy to obtain. There are ways to work with caller-allocated memory even when its size is not known upfront, but they are outside the scope of this article. In such cases, a kata can use memory allocated by the user.

### Mixed approach: `malloc` in the solution and `free` in tests

In the vast majority of cases when a kata requires the solution to allocate memory, authors choose the naive approach of allocating the memory in the solution and releasing it with `free` in the test suite after performing all necessary assertions. This mimics the behavior known from high-level languages, where returning an array or object from the user's solution is perfectly valid, but it is not always the best, or even a correct, way of working with unmanaged memory in C.

This approach is useful when the size of the result is not known before the call. The solution is responsible for determining the correct size and returning it along with the pointer to the buffer, and the test suite is responsible for freeing the buffer after every call.

<details>
<summary>Example</summary>

Kata task:

> Given a natural number `n`, return all prime numbers up to and including `n`.

Solution:

```c
//get all prime numbers up to and including upto
//use an output parameter to return the size of the result
int* get_primes(int upto, int* size) {

  //the solution allocates required memory
  int* result = malloc(sizeof(int) * ...);

  //... fill result with primes
  //...

  *size = ...; //assign amount of primes
  return result;
}
```

Test suite:

```c
Test(fixed_tests, should_return_2_and_3_for_4) {

  int expected[] = {2, 3}, expected_size = 2;
  int actual_size;

  //call user solution and expect it to allocate the returned array
  int* actual = get_primes(4, &actual_size);

  //...assert on actual_size
  //...assert on contents of actual

  //after performing all necessary assertions,
  //free the array allocated by the user solution
  free(actual);
}
```

</details>
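
For reference, one possible complete version of the solution stub above is sketched below. It uses simple trial division. The choice of algorithm, the size of the allocation, and returning `NULL` for an empty result are all assumptions made for this sketch, and a real kata would have to specify such details in its description.

```c
#include <stdlib.h>

//Returns a malloc'ed array with all primes up to and including `upto`
//and stores their count in *size. Ownership of the array passes to the
//caller, who has to free() it.
//Assumption: returns NULL (with *size == 0) when there are no primes
//to return or when the allocation fails.
int* get_primes(int upto, int* size) {
  *size = 0;
  if (upto < 2) return NULL;

  //upper bound: there are fewer than `upto` primes not greater than `upto`
  int* result = malloc(sizeof(int) * (size_t)upto);
  if (result == NULL) return NULL;

  int count = 0;
  for (int candidate = 2; candidate <= upto; ++candidate) {
    int is_prime = 1;
    //trial division by the primes found so far, up to sqrt(candidate)
    for (int i = 0; i < count && result[i] <= candidate / result[i]; ++i) {
      if (candidate % result[i] == 0) { is_prime = 0; break; }
    }
    if (is_prime) result[count++] = candidate;
  }

  *size = count;
  return result;
}
```

Because `free(NULL)` is a no-op, the test suite can call `free` on the returned pointer unconditionally.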

This approach works similarly to functions like `strdup` or `asprintf`, which allocate the required memory and pass its ownership to the caller. It's a good fit for Codewars kata because it's simple, effective, and works well in the Codewars code runner.

A potential issue with the mixed approach is related not to Codewars, but to "real world" C programming and design. It might not work well for complex memory structures, or when the callee has to do advanced book-keeping and tracking of allocated memory. It also does not work well when data is passed between modules (for example, between libraries, or from a library to the main program), because the module that frees the memory has to use an allocator matching the one used by the module that allocated it.

### Memory managed by the solution

The opposite of managing memory in the test suite is delegating the responsibility to the solver. This way, tests do not need to worry about the problematic aspects of memory management, and kata authors give users freedom of implementation while reducing the boilerplate required in the tests.

The idea boils down to asking users to provide their own equivalents of allocation and deallocation functions. The solution function is responsible not only for solving the task but also for allocating memory and storing book-keeping information. The clean-up function is responsible for releasing the resources.

There are many possible ways of implementing the allocation scheme and the corresponding clean-up function, but an example implementation could be:

<details>
<summary>Example</summary>

Kata task:

> Given the initial generation of a Game of Life population, return the state and size of the game world after `n` generations.

Solution:

```c
//solution function, which allocates all required memory and solves the task
char** game_of_life(int generations, char** initial_generation, int* world_h, int* world_w) {

  char** world = ...; //allocating memory for the world map

  for(int i=0; i < generations; ++i) {
    //... actual game, which potentially requires additional (re)allocations
  }

  //return the final state of the game world to the caller
  return world;
}

//clean-up function
void destroy_world(char** world) {
  //... deallocate all memory appropriately in a way
  //which matches how the game_of_life allocated it.
}
```

Tests:

```c
int world_w = 3, world_h = 3;
char** initial_generation = ...; //set up a GoL glider
int generations = 25;

//invoke solution function, which also allocates memory
char** actual = game_of_life(generations, initial_generation, &world_h, &world_w);

//... perform assertions on the world map and verify the state of its cells

//call the clean-up function, which deallocates all memory
destroy_world(actual);

//...at this point memory is deallocated, no need to call free
```

</details>

Memory management by the callee is not a common requirement for Codewars kata. It can be useful when the memory is structured in a complex way, or when it has to be tracked in some particular way. It mimics the behavior of C libraries, which often provide symmetrical allocation and deallocation functions, and/or use opaque pointers as elements of their interface.
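
A pair of symmetrical functions, with book-keeping information stored next to the data, could look like the sketch below. The `World` structure and the exact signatures are hypothetical. They differ from the `char**` based example above and only illustrate the pattern, assuming positive dimensions.

```c
#include <stdlib.h>

//Hypothetical structure bundling the data with its book-keeping information.
typedef struct World {
  int height, width;
  char** cells; //`height` rows of `width` cells each
} World;

void destroy_world(World* world);

//Allocation function: creates a world filled with dead cells (' ').
//Returns NULL if any allocation fails.
World* create_world(int height, int width) {
  World* world = malloc(sizeof *world);
  if (world == NULL) return NULL;
  world->height = height;
  world->width = width;
  world->cells = malloc(sizeof *world->cells * (size_t)height);
  if (world->cells == NULL) { free(world); return NULL; }
  for (int row = 0; row < height; ++row) {
    world->cells[row] = malloc((size_t)width);
    if (world->cells[row] == NULL) {
      world->height = row; //only rows [0, row) were allocated
      destroy_world(world);
      return NULL;
    }
    for (int col = 0; col < width; ++col) world->cells[row][col] = ' ';
  }
  return world;
}

//Clean-up function: mirrors create_world and accepts NULL, like free() does.
void destroy_world(World* world) {
  if (world == NULL) return;
  for (int row = 0; row < world->height; ++row)
    free(world->cells[row]);
  free(world->cells);
  free(world);
}
```

The tests only ever call `create_world` and `destroy_world`, so the solver remains free to change how the memory behind the structure is organized.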

## Two-dimensional arrays

Some kata require the user solution to return a two-dimensional array, for example a 2D matrix or an array of C-strings. Such scenarios are a bit more complex, because not only the higher-order array has to be properly managed, but all of its individual entries as well. The exact approach to allocating such structures depends on the scenario, because different techniques are suitable for square or rectangular arrays, jagged arrays, arrays of null-terminated strings, etc.

Just like any other form of memory, 2D arrays can be managed by the test suite, by the user solution, or by both. As long as the size of the 2D array is known before calling the solution and does not change in the course of the calculations, the test suite can perform all necessary allocations and pass the memory to the solution function ready to use. This is a very good approach when working with chessboards, sudokus, matrices, mazes of predetermined sizes, etc. However, when the size of the answer cannot be easily determined beforehand, the mixed approach, or memory management by the callee with a clean-up function provided by the user, can be a better choice.

:::note Note on examples
For simplicity, this section uses the terms "2D array", "array of arrays", and "matrix" interchangeably and assumes row-major order, i.e. data can be accessed with `array[row][col]`.
:::

### Naive approach: N+1 allocations

This is the most common way of using dynamically allocated multidimensional arrays. An array of pointers to rows is allocated first, and each row is allocated individually afterwards.

<details>
<summary>Example</summary>

Allocation:

```c
//allocate array of rows first
char** world = malloc(sizeof(char*) * world_h);
for(int i=0; i < world_h; ++i) {

  //allocate every row individually
  world[i] = malloc(world_w);
}
```

Deallocation:

```c
//... deallocate all memory also row by row
for(int i=0; i < world_h; ++i)
  free(world[i]);

free(world);
```

</details>

The advantage of individually allocated rows is that they work well for jagged arrays, where each row can have a different length.

This approach, despite appearing simple, has drawbacks, mostly related to performance. It tends to be slow, since every row requires a separate call to the allocator, and rows scattered around the heap can be less cache-friendly than a single contiguous block. It can also cause excessive memory fragmentation.

Additionally, it is sometimes used unnecessarily to return an array of data (usually strings) that could be turned into constants.

### Array of string constants

This approach is related to [returning statically allocated constant data](#statically-allocated-constant-data), but extended to arrays. Some kata require the user to return an array of strings which could be turned into constants. In such a case, the string constants should be replaced with an `enum`, and a single one-dimensional, dynamically allocated array of enum values should be used.

The only tricky part is the stringification of the values if they are going to be displayed or used as part of assertion messages.
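
A sketch of this idea, including the stringification part, is shown below. The task (translating a path such as `"NNE"` into directions) and all names used here are hypothetical and serve only as an illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef enum Direction { NORTH, EAST, SOUTH, WEST } Direction;

//Names for assertion messages. The order has to match the enum definition.
static const char* const DIRECTION_NAMES[] = { "NORTH", "EAST", "SOUTH", "WEST" };

const char* direction_name(Direction d) {
  return ((unsigned)d <= WEST) ? DIRECTION_NAMES[d] : "<invalid>";
}

//Hypothetical solution: returns a single malloc'ed array of enum values
//(to be freed by the caller) and stores its length in *size.
//Characters other than N, E, S, W are skipped.
//Returns NULL (with *size == 0) if the allocation fails.
Direction* parse_path(const char* path, size_t* size) {
  size_t length = strlen(path);
  *size = 0;
  Direction* result = malloc(sizeof(Direction) * (length + 1)); //+1 avoids malloc(0)
  if (result == NULL) return NULL;

  size_t count = 0;
  for (size_t i = 0; i < length; ++i) {
    switch (path[i]) {
      case 'N': result[count++] = NORTH; break;
      case 'E': result[count++] = EAST;  break;
      case 'S': result[count++] = SOUTH; break;
      case 'W': result[count++] = WEST;  break;
      default: break;
    }
  }
  *size = count;
  return result;
}
```

Only one allocation and one `free` are needed, compared to one allocation per string plus one for the outer array in the N+1 approach.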

### Flat array

A very good, but often overlooked, way to represent a 2D array is to store it in a regular, linear array `T[]`, potentially supported by type casts between the linear buffer and a two-dimensional matrix.

<details>
<summary>Example</summary>

```c
//declaration of solution accepting a two-dimensional array
//...
    world_linear[row * world_w + col] = 'x'; //set a cell as alive

    //access a cell in 2d array
    world_2d[row][col] = ' '; //set a cell as dead
  }
}

//pass the 2d array to user solution
play_game_of_life(world_h, world_w, world_2d);

//deallocate all memory at once
free(world_linear);
```

</details>
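
A self-contained sketch of a cast between a linear buffer and a 2D view is shown below. It assumes a compiler with support for variable length array types (C99, optional since C11), and the names `clear_world` and `flat_array_demo` are hypothetical.

```c
#include <stdlib.h>

//Hypothetical solution stub: receives the world as a 2D array.
//The size hints in the prototype allow it to use world[row][col].
void clear_world(int world_h, int world_w, char world[world_h][world_w]) {
  for (int row = 0; row < world_h; ++row)
    for (int col = 0; col < world_w; ++col)
      world[row][col] = ' ';
}

//Test-side sketch: one allocation, two views of the same memory.
//Assumes world_h > 1 and world_w > 2, since it touches the cell [1][2].
//Returns 1 if both views refer to the same cell, 0 otherwise
//(or if the allocation failed).
int flat_array_demo(int world_h, int world_w) {
  char* world_linear = malloc((size_t)world_h * (size_t)world_w);
  if (world_linear == NULL) return 0;

  //view the linear buffer as an array of rows, each world_w cells wide
  char (*world_2d)[world_w] = (char (*)[world_w])world_linear;

  clear_world(world_h, world_w, world_2d);

  world_linear[1 * world_w + 2] = 'x';     //linear access to row 1, column 2
  int same_cell = (world_2d[1][2] == 'x'); //2D access sees the same cell

  free(world_linear); //a single call releases the whole array
  return same_cell;
}
```

The cast does not copy anything. `world_linear` and `world_2d` are two views of the same block of memory, so the block has to be freed exactly once.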

This way, the complexity of memory management is greatly reduced, since all the necessary memory can be allocated and freed with a single call to `malloc` (or an equivalent) and `free`.

The drawback of the version with casts between linear and 2D arrays is that it is best suited for perfectly rectangular arrays, i.e. arrays whose sub-arrays all have equal length. However, the version without casts can be used effectively whenever the boundaries between inner arrays can be determined efficiently, for example when each row of a matrix has a well-known length, when rows have precisely defined, although different, lengths (like the rows of Pascal's triangle), or when string entries are clearly terminated.

This method also does not perfectly fit the scenario in which such an array should be *returned* from a function. The function still has to declare its return type as `T*`, and the caller has to either work with the linear form of the array or perform the cast on its own.
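
The version without casts, returned from a function as `T*`, can be sketched with Pascal's triangle, whose rows have different but precisely defined lengths. The function names are hypothetical, and the sketch performs no overflow checks for large triangles.

```c
#include <stddef.h>
#include <stdlib.h>

//Index of the first element of `row` in the linear buffer:
//rows 0..row-1 hold 1 + 2 + ... + row = row * (row + 1) / 2 elements.
size_t row_offset(size_t row) {
  return row * (row + 1) / 2;
}

//Returns the first `rows` rows of Pascal's triangle in a single malloc'ed
//buffer, to be released by the caller with one call to free().
//Returns NULL if `rows` is zero or the allocation fails.
unsigned long* pascal_triangle(size_t rows) {
  if (rows == 0) return NULL;
  unsigned long* triangle = malloc(sizeof *triangle * row_offset(rows));
  if (triangle == NULL) return NULL;

  for (size_t row = 0; row < rows; ++row) {
    unsigned long* current = triangle + row_offset(row);
    current[0] = current[row] = 1; //both edges of every row are 1
    for (size_t col = 1; col < row; ++col) {
      const unsigned long* previous = triangle + row_offset(row - 1);
      current[col] = previous[col - 1] + previous[col];
    }
  }
  return triangle;
}
```

The caller works with the linear form directly. The element at `row`, `col` is `triangle[row_offset(row) + col]`, and the whole triangle is released with a single `free(triangle)`.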