Add splitlines? #19759

Closed
mpastell opened this issue Dec 29, 2016 · 9 comments
Labels
strings "Strings!"

Comments

@mpastell
Contributor

I think it would be useful to have a splitlines method similar to Python's:

Return a list of the lines in the string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true.

https://docs.python.org/3/library/stdtypes.html?highlight=splitlines .

I think it would be convenient; I, at least, very often process strings line by line. It would also help avoid bugs related to \r on Windows.

@oscardssmith
Member

A way to iterate by line would possibly be more useful, as splitlines only works well for small files. (Having both would be nice too.)

@mpastell
Contributor Author

There is eachline, but it keeps the line endings and doesn't currently work for in-memory strings. I think adding splitlines in addition to that would be useful, as would giving eachline an option to exclude line breaks from its output.
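For readers on a recent Julia 1.x release, the following is a minimal sketch of line-by-line iteration over an in-memory string. It assumes the `keep` keyword of `eachline` (mentioned later in this thread) and wraps the string in an `IOBuffer`; the sample text and variable names are made up for illustration.

```julia
# Sample input mixing Windows ("\r\n") and Unix ("\n") line endings.
text = "first\r\nsecond\nthird"

# Lazy iteration over an in-memory string via IOBuffer.
# With the default keep=false, the trailing "\n" or "\r\n" is dropped.
for line in eachline(IOBuffer(text))
    println(repr(line))
end

# Collecting the iterator gives a vector of lines.
stripped = collect(eachline(IOBuffer(text)))

# keep=true preserves the original line endings.
kept = collect(eachline(IOBuffer(text); keep=true))
```

Because `eachline` is lazy, it also addresses the concern about large inputs: lines are produced one at a time rather than materialized all at once.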

@giordano
Contributor

giordano commented Dec 30, 2016

What is the difference with split?

If you are only interested in removing leading and trailing whitespace, there is also strip (and lstrip and rstrip).

@mpastell
Contributor Author

The difference from split is that \r\n and \n are both handled automatically as line endings. The first implementation that comes to mind would be:

split(string, r"\n|\r\n")

The corresponding function in Python also handles several other line-ending formats, and it could be wise to do the same. I recently had a few bugs in Weave.jl due to not handling \r (I just did split(s, "\n")), and I suspect that others who don't regularly use Julia on Windows might do the same.

Using rstrip when parsing formats with significant whitespace is not a good idea. There is also chomp, but it doesn't remove trailing \r.
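A small sketch of the pitfall described in this comment, assuming a recent Julia 1.x. The sample string and variable names are illustrative only; the regex is the one proposed above.

```julia
# Sample input with one Windows ("\r\n") and one Unix ("\n") line ending.
s = "a\r\nb\nc"

# Splitting on "\n" alone leaves the "\r" attached to the first line.
naive = split(s, "\n")

# The regex from the comment above treats "\n" and "\r\n" as one separator each.
either = split(s, r"\n|\r\n")

# Unlike Python's splitlines, split keeps an empty trailing field
# when the input ends with a line break.
trailing = split("a\n", "\n")
```

Passing `keepempty=false` to `split` would drop the trailing empty field, but it would also drop genuinely empty lines in the middle of the text, so it is not a drop-in substitute for a splitlines function.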

@giordano
Contributor

giordano commented Jan 2, 2017

I'd say that there is no need for a new function (split is just fine), only a need to improve the handling of newlines (which is the topic of #19785), right? If so, this ticket can be closed.

@mpastell
Contributor Author

mpastell commented Jan 2, 2017

I think splitlines contributes to improved handling of newlines, and it would be nice to have it, just as there are both print and println, readlines and eachline, etc. Many languages have both split and splitlines, e.g. Python, Ruby and Clojure.

@giordano
Contributor

giordano commented Jan 2, 2017

I see your point now, thanks for clarifying it.

@mpastell
Contributor Author

After #19944 gets merged this could simply be defined as

splitlines(str::String, chomp::Bool=true) = readlines(IOBuffer(str), chomp)
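On a recent Julia 1.x release, where `readlines` takes a `keep` keyword rather than the `chomp` argument proposed at the time, the same idea could be sketched as follows. The name `splitlines` is hypothetical (it is not a Base function), and the default `keep=false` plays the role of `chomp=true` in the one-liner above.

```julia
# Hypothetical helper, not part of Base: split a string into lines by
# reading it through an IOBuffer. With keep=false (the default), trailing
# "\n" or "\r\n" is removed from each line; keep=true preserves it.
splitlines(str::AbstractString; keep::Bool=false) =
    readlines(IOBuffer(String(str)); keep=keep)

lines = splitlines("a\r\nb\n")
lines_with_endings = splitlines("a\r\nb\n"; keep=true)
```

Unlike `split(str, "\n")`, this does not produce an empty trailing element when the input ends with a line break.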

@kshyatt added the strings "Strings!" label Jan 10, 2017

@laborg
Contributor

laborg commented Feb 2, 2022

This issue can be closed: eachline and readline have had a keep keyword since #20203 was merged.


6 participants