Clojure

Mixed end-of-line endings in the source code

Details

  • Type: Enhancement Enhancement
  • Status: Open Open
  • Priority: Minor Minor
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: None
  • Labels:
    None
  • Patch:
    Code
  • Approval:
    Triaged

Description

Some source files in Clojure have the DOS/Windows carriage return/line feed line endings, whereas the majority use the Linux/Unix line feed only line endings.

The main ongoing issue with this is that if there is a correctly formatted git patch for such a file, then using the recommended 'git am' command to apply such a patch always gives warnings. Some screeners see such warnings, and immediately disregard the patch, asking for the warnings to be fixed before they will consider it. As far as I can tell, this is impossible. Thus patches modifying files with CR/LF line endings are nearly in a deadlock state, requiring unusual effort from non-screeners to remind screeners that such patches are as good as they can be, and would they please ignore those warnings because there is no way around them.

If someone can find a way to apply git patches to such files without such warnings, please update this description or add a comment.

Stu Halloway asks: Wouldn't eliminating all CR/LF endings in Clojure source files invalidate many unapplied patches? My answer: Please do it anyway. Others will step in to fix those patches, if they become broken.

The attached file clj-1026-bash-script.sh does several things, which should be easy to verify by reading through the script itself, which is quite simple:

1. find all files with CR/LF endings, and remove the CR characters.
2. Does 'git status .' to show the files that have been modified
3. Does 'git diff .' and 'git diff -w .', showing the number of lines that have been modified, and the number of lines that have been modified in a way that changes something besides whitespace, which should be exactly 0 lines, because only whitespace has been changed.

Original description when this ticket was created:

While examining some of the Clojure source code, I discovered that some files had mixed line endings, or CRLF line endings on a non-Windows box. Using .gitattributes, we can correct that so that files have the right endings for the platform that it's on.

Activity

Hide
John Szakmeister added a comment -

Patch to fix line endings and introduce .gitattributes.

Show
John Szakmeister added a comment - Patch to fix line endings and introduce .gitattributes.
John Szakmeister made changes -
Field Original Value New Value
Attachment 0001-Introduce-end-of-line-normalization.patch [ 11384 ]
Hide
Stuart Halloway added a comment -

This looks like a change to every line in the world, which makes it hard to vet. Also: will it render incompatible all other outstanding patches at the time it is applied?

Show
Stuart Halloway added a comment - This looks like a change to every line in the world, which makes it hard to vet. Also: will it render incompatible all other outstanding patches at the time it is applied?
Hide
John Szakmeister added a comment - - edited

You can use git diff -w $(git merge-base HEAD master) to see the actual diff minus the line ending change. Here it is inline:

:: git diff -w $(git merge-base HEAD master)
diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..7b89cfa
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,6 @@
+*.txt           text
+*.clj           text
+*.html          text
+*.js            text
+*.css           text
+*.java          text diff=java

Also, no, it won't render all the outstanding patches incompatible. For one, it seems like the files that have the eols mixed or in CRLF aren't touched much. I also went through the majority of outstanding patches and couldn't fix one that conflicts. Secondly, format-patch emits the patch inline and is intended to be sent via email. SMTP says that all lines must end with CRLF, so line endings are effectively lost. So git will convert it to the right line ending on application.

It can conflict with any outstanding branches that you may have. Supplying the renormalization option on merge can help:

git merge -X renormalize <branch-name>

Or, you can enable this by default for your repository:

git config --local merge.renormalize true

If you think it's too much trouble, let's just drop it though.

Show
John Szakmeister added a comment - - edited You can use git diff -w $(git merge-base HEAD master) to see the actual diff minus the line ending change. Here it is inline:
:: git diff -w $(git merge-base HEAD master)
diff --git a/.gitattributes b/.gitattributes
new file mode 100644
index 0000000..7b89cfa
--- /dev/null
+++ b/.gitattributes
@@ -0,0 +1,6 @@
+*.txt           text
+*.clj           text
+*.html          text
+*.js            text
+*.css           text
+*.java          text diff=java
Also, no, it won't render all the outstanding patches incompatible. For one, it seems like the files that have the eols mixed or in CRLF aren't touched much. I also went through the majority of outstanding patches and couldn't fix one that conflicts. Secondly, format-patch emits the patch inline and is intended to be sent via email. SMTP says that all lines must end with CRLF, so line endings are effectively lost. So git will convert it to the right line ending on application. It can conflict with any outstanding branches that you may have. Supplying the renormalization option on merge can help:
git merge -X renormalize <branch-name>
Or, you can enable this by default for your repository:
git config --local merge.renormalize true
If you think it's too much trouble, let's just drop it though.
Hide
Stuart Halloway added a comment -

Patch does not apply on my working copy of Clojure. I am using

git am -s ...
/Users/stu/repos/clojure/.git/rebase-apply/patch:344: trailing whitespace.
  p {  	
/Users/stu/repos/clojure/.git/rebase-apply/patch:350: space before tab in indent.
  	margin-left: 0.5in;
error: patch failed: epl-v10.html:1
error: epl-v10.html: patch does not apply
error: patch failed: src/jvm/clojure/asm/AnnotationVisitor.java:1
error: src/jvm/clojure/asm/AnnotationVisitor.java: patch does not apply
error: patch failed: src/jvm/clojure/asm/AnnotationWriter.java:1
error: src/jvm/clojure/asm/AnnotationWriter.java: patch does not apply

I am willing to do this, just inept.

Show
Stuart Halloway added a comment - Patch does not apply on my working copy of Clojure. I am using
git am -s ...
/Users/stu/repos/clojure/.git/rebase-apply/patch:344: trailing whitespace.
  p {  	
/Users/stu/repos/clojure/.git/rebase-apply/patch:350: space before tab in indent.
  	margin-left: 0.5in;
error: patch failed: epl-v10.html:1
error: epl-v10.html: patch does not apply
error: patch failed: src/jvm/clojure/asm/AnnotationVisitor.java:1
error: src/jvm/clojure/asm/AnnotationVisitor.java: patch does not apply
error: patch failed: src/jvm/clojure/asm/AnnotationWriter.java:1
error: src/jvm/clojure/asm/AnnotationWriter.java: patch does not apply
I am willing to do this, just inept.
Hide
Andy Fingerhut added a comment -

Stuart, I updated this page http://dev.clojure.org/display/design/JIRA+workflow a while back when I had trouble applying some patches involving files with carriage return line endings. I did some Googling on git docs and found the option '--keep-cr' that seems to help in such cases. Try that out.

Show
Andy Fingerhut added a comment - Stuart, I updated this page http://dev.clojure.org/display/design/JIRA+workflow a while back when I had trouble applying some patches involving files with carriage return line endings. I did some Googling on git docs and found the option '--keep-cr' that seems to help in such cases. Try that out.
Andy Fingerhut made changes -
Patch Code [ 10001 ]
Stuart Halloway made changes -
Issue Type Defect [ 1 ] Enhancement [ 4 ]
Hide
Alex Miller added a comment -

I think in this case, we should not provide a patch but a process. And I would like to see this cleaned up.

Show
Alex Miller added a comment - I think in this case, we should not provide a patch but a process. And I would like to see this cleaned up.
Alex Miller made changes -
Approval Triaged [ 10120 ]
Hide
Andy Fingerhut added a comment -

I am not sure what kind of process you mean, Alex, but here are some thoughts.

There are some patches attached to tickets right now that modify Java source files that have CR/LF line endings throughout. Typically those patches remove some lines, and add new ones, and those new lines typically have CR/LF line endings, too, because whatever editor the person used while making the patch saw the file have CR/LF line endings, and preserved that on new lines of text they added to the file.

So if you want to get rid of CRs throughout the Clojure source code, and keep them out, one way would be to:

(1) do a big commit to get rid of all of the current CR characters in source files.

(2) update the few pending patches that introduce lines with CRs (I checked. There are only 15 tickets with such patches right now.)

(3) I can automate a process of detecting patches that have CRs and flag them for CR removal.

In fact (2) can be done as a result of (3).

Show
Andy Fingerhut added a comment - I am not sure what kind of process you mean, Alex, but here are some thoughts. There are some patches attached to tickets right now that modify Java source files that have CR/LF line endings throughout. Typically those patches remove some lines, and add new ones, and those new lines typically have CR/LF line endings, too, because whatever editor the person used while making the patch saw the file have CR/LF line endings, and preserved that on new lines of text they added to the file. So if you want to get rid of CRs throughout the Clojure source code, and keep them out, one way would be to: (1) do a big commit to get rid of all of the current CR characters in source files. (2) update the few pending patches that introduce lines with CRs (I checked. There are only 15 tickets with such patches right now.) (3) I can automate a process of detecting patches that have CRs and flag them for CR removal. In fact (2) can be done as a result of (3).
Hide
Alex Miller added a comment -

I just mean that any patch for this ticket is doomed to be continuously broken until it is applied, so it would be better to instead have a set of steps that someone can follow to fix all line endings, than a patch. That is #1 on your list should be a set of steps. #2 and 3 would be nice as well.

Show
Alex Miller added a comment - I just mean that any patch for this ticket is doomed to be continuously broken until it is applied, so it would be better to instead have a set of steps that someone can follow to fix all line endings, than a patch. That is #1 on your list should be a set of steps. #2 and 3 would be nice as well.
Hide
Andy Fingerhut added a comment -

clj-1026-bash-script.sh is not a patch, but a bash script that can be run in an unmodified Clojure source tree, preferably freshly cloned, that will delete carriage returns characters from all files except those in the .git directory, and directories.

It also prints out the results of a few commands to help you gain confidence that those are the only changes it made.

Tested on Mac OS X 10.8.4 and Ubuntu Linux 12.04.3.

Show
Andy Fingerhut added a comment - clj-1026-bash-script.sh is not a patch, but a bash script that can be run in an unmodified Clojure source tree, preferably freshly cloned, that will delete carriage returns characters from all files except those in the .git directory, and directories. It also prints out the results of a few commands to help you gain confidence that those are the only changes it made. Tested on Mac OS X 10.8.4 and Ubuntu Linux 12.04.3.
Andy Fingerhut made changes -
Attachment clj-1026-bash-script.sh [ 12235 ]
Andy Fingerhut made changes -
Description While examining some of the Clojure source code, I discovered that some files had mixed line endings, or CRLF line endings on a non-Windows box. Using {{.gitattributes}}, we can correct that so that files have the right endings for the platform that it's on. Some source files in Clojure have the DOS/Windows carriage return/line feed line endings, whereas the majority use the Linux/Unix line feed only line endings.

The main ongoing issue with this is that if there is a *correctly formatted* git patch for such a file, then using the recommended 'git am' command to apply such a patch *always gives warnings*. Some screeners see such warnings, and immediately disregard the patch, asking for the warnings to be fixed before they will consider it. As far as I can tell, *this is impossible*. Thus patches modifying files with CR/LF line endings are nearly in a deadlock state, requiring unusual effort from non-screeners to remind screeners that such patches are as good as they can be, and would they please ignore those warnings because there is no way around them.

If someone can find a way to apply git patches to such files without such warnings, please update this description or add a comment.

Stu Halloway asks: Wouldn't eliminating all CR/LF endings in Clojure source files invalidate many unapplied patches? My answer: Please do it anyway. Others will step in to fix those patches, if they become broken.

Original description when this ticket was created:

While examining some of the Clojure source code, I discovered that some files had mixed line endings, or CRLF line endings on a non-Windows box. Using {{.gitattributes}}, we can correct that so that files have the right endings for the platform that it's on.
Andy Fingerhut made changes -
Description Some source files in Clojure have the DOS/Windows carriage return/line feed line endings, whereas the majority use the Linux/Unix line feed only line endings.

The main ongoing issue with this is that if there is a *correctly formatted* git patch for such a file, then using the recommended 'git am' command to apply such a patch *always gives warnings*. Some screeners see such warnings, and immediately disregard the patch, asking for the warnings to be fixed before they will consider it. As far as I can tell, *this is impossible*. Thus patches modifying files with CR/LF line endings are nearly in a deadlock state, requiring unusual effort from non-screeners to remind screeners that such patches are as good as they can be, and would they please ignore those warnings because there is no way around them.

If someone can find a way to apply git patches to such files without such warnings, please update this description or add a comment.

Stu Halloway asks: Wouldn't eliminating all CR/LF endings in Clojure source files invalidate many unapplied patches? My answer: Please do it anyway. Others will step in to fix those patches, if they become broken.

Original description when this ticket was created:

While examining some of the Clojure source code, I discovered that some files had mixed line endings, or CRLF line endings on a non-Windows box. Using {{.gitattributes}}, we can correct that so that files have the right endings for the platform that it's on.
Some source files in Clojure have the DOS/Windows carriage return/line feed line endings, whereas the majority use the Linux/Unix line feed only line endings.

The main ongoing issue with this is that if there is a *correctly formatted* git patch for such a file, then using the recommended 'git am' command to apply such a patch *always gives warnings*. Some screeners see such warnings, and immediately disregard the patch, asking for the warnings to be fixed before they will consider it. As far as I can tell, *this is impossible*. Thus patches modifying files with CR/LF line endings are nearly in a deadlock state, requiring unusual effort from non-screeners to remind screeners that such patches are as good as they can be, and would they please ignore those warnings because there is no way around them.

If someone can find a way to apply git patches to such files without such warnings, please update this description or add a comment.

Stu Halloway asks: Wouldn't eliminating all CR/LF endings in Clojure source files invalidate many unapplied patches? My answer: Please do it anyway. Others will step in to fix those patches, if they become broken.

The attached file clj-1026-bash-script.sh does several things, which should be easy to verify by reading through the script itself, which is quite simple:

1. find all files with CR/LF endings, and remove the CR characters.
2. Does 'git status .' to show the files that have been modified
3. Does 'git diff .' and 'git diff -w .', showing the number of lines that have been modified, and the number of lines that have been modified in a way that changes something besides whitespace, which should be exactly 0 lines, because only whitespace has been changed.


Original description when this ticket was created:

While examining some of the Clojure source code, I discovered that some files had mixed line endings, or CRLF line endings on a non-Windows box. Using {{.gitattributes}}, we can correct that so that files have the right endings for the platform that it's on.
Hide
Andy Fingerhut added a comment -

Shell script clj-1026-bash-script.sh has updated 'comments' printed with the latest stats of number of files and lines containing carriage return characters as of Aug 31 2014.

Show
Andy Fingerhut added a comment - Shell script clj-1026-bash-script.sh has updated 'comments' printed with the latest stats of number of files and lines containing carriage return characters as of Aug 31 2014.
Andy Fingerhut made changes -
Attachment clj-1026-bash-script.sh [ 13299 ]
Andy Fingerhut made changes -
Attachment clj-1026-bash-script.sh [ 12235 ]

People

Vote (3)
Watch (1)

Dates

  • Created:
    Updated: