Vectorized functions for string operations

UCO · November 18, 2025, 2:40pm

Hi dear scilab users,

I came across something that bothers me. Either there is no function that easily does what I want, or I just can’t find a simple way to do it.

I have a table in which one column is filled with strings, and a code is embedded in each of these strings. Something like this:
[546] code 1
[007] code 2
[89] code 3
[546] code 1
…

The thing is, my file is quite large (500000+ lines), and I would like to extract these codes in a vectorized way, calling functions directly on the whole array instead of looping over it. The problem is that strsplit and some other functions seem to work only on a single string.

Is there a way to do this, or is it not implemented?

Thank you all !

mottelet · November 18, 2025, 3:39pm

Hello, and welcome to Scilab’s Discourse !

Can you just post a small example ?

S.

UCO · November 19, 2025, 9:00am

Hello,
It feels nice to get an answer!
I wrote a little code to clarify the situation:

function M = create_dumb_table(n_elements)
	Date = hours(1:n_elements)'
	code = sample(n_elements, ["[12] Hazard", "[7] Emergency", "[131] Brake", "[999] miscellaneous code", "[56] Future maintenance required", "[4] OK", "[0] N\A", "[33311] lunch break"])'
	M=table(Date, code, "VariableNames", ["Date" "code"]);
endfunction

M = create_dumb_table(30) // much more complex in my case but the core is here

// I would like to do something like
[splits, _] = strsplit(M("code"), "]");
M("code extract") = uint16(part(splits(1,:), 2:$)); // or strsplit(splits(1,:), "["); ...

// Which I can't, because strsplit doesn't seem to allow vectors of strings, and I didn't find any convenient method

// So I ended up with something like that, which is very slow on large arrays

function list_floats = find_floats_in_brackets(str_array)
	n = size(str_array,1)
	list_floats = zeros(n,1)
	for i = 1: n
		[str_number, _] = strsplit(str_array(i), "]")
		str_number = part(str_number(1), 2:$)
		list_floats(i) = strtod(str_number) // convert the numeric string to a double
	end
endfunction

M("code extract") = find_floats_in_brackets(M("code"));

Is there a more practical way to do this?
I thought about using strchr to get the length of the piece to cut, but although the part function works on vectors, it doesn’t seem to allow a different cut length for each row of the string array.

Little bonus question: here my codes are integers, but if they were floats, would I be forced to use evstr? Since they are plain floats with no calculations in them, maybe there is something faster.

Thank you in advance !

UCO

mottelet · November 19, 2025, 9:56am

Basics are always your best friends:

msscanf(-1,M.code,"[%d]")

 ans = [30x1 double]

   56.
   0.
   0.
   7.
   56.
   33311.
   4.
   7.
   131.
   999.
   999.
   4.
   56.
   131.
   33311.
   12.
   33311.
   56.
   7.
   131.
   7.
   7.
   999.
   33311.
   131.
   131.
   12.
   7.
   56.
   999.

On my MacBook the timing seems OK for a large table:

--> M = create_dumb_table(500000);

--> tic;

--> i = msscanf(-1,M.code,"[%d]");

--> toc

 ans = 

   0.0657260
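
The same call may also cover the bonus question about floats. Below is a minimal sketch, assuming the codes keep the same bracket layout but contain decimal values; the sample vector `codes` is hypothetical and not taken from the original table. It relies on the `%lg` conversion, which reads a floating-point number as a double, so evstr is probably not needed here.

```scilab
// Hypothetical sample data: same "[code] label" layout, with decimal codes
codes = ["[1.5] Hazard"; "[0.25] OK"; "[33311] lunch break"];

// -1 applies the format to every row; "%lg" reads each code as a double
values = msscanf(-1, codes, "[%lg]");

disp(values)
```

A table column such as M.code can be passed in place of `codes`, as in the call above.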

UCO · November 19, 2025, 11:54am

Thank you, you have completely answered my question. I was only looking at the functions in the strings section of the help, and I didn’t think to look anywhere else.

I must admit I still struggle, because in reality there are exceptions in my file: sometimes there are no brackets at all, and sometimes there is text inside the brackets. It is very rare, but it happens, and msscanf stops at the first mismatch. I will try to figure out how to handle these.

davcheze · November 24, 2025, 8:59am

Hello,

It sounds like you need to match, extract and handle the different cases in your vector of strings separately: have a look at grep, with a regular expression as the needle to match in the haystack.

best,

David
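
The grep suggestion can be sketched as follows. This is only an illustration under stated assumptions: the sample vector `codes`, the variable names `ok` and `extract`, and the choice of %nan for rows that do not match are all hypothetical, and the real file may need a different pattern.

```scilab
// Hypothetical sample data: two well-formed rows and two exceptions
codes = ["[12] Hazard"; "no brackets here"; "[abc] text in brackets"; "[7] Emergency"];

// Indices of the rows that start with "[digits]" ("r" enables regular expressions)
ok = grep(codes, "/^\[\d+\]/", "r");

// Mark every row as missing, then parse only the rows that matched
extract = %nan * ones(size(codes, 1), 1);
if ~isempty(ok) then
    extract(ok) = msscanf(-1, codes(ok), "[%d]");
end

disp(extract)
```

Since only matching rows reach msscanf, the rare malformed rows should no longer interrupt the scan; they can then be located with isnan(extract) and handled separately.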
