Commit 9340cc7

[rust] add parser (#1619)
* feat: unpack gpac
* fix: linux ci
* fix: mac build
* fix: remove unused [no ci]
* fix: ignore config.h [no ci]
* temp commit, will drop this soon
* fix: install gpac
* fix: gpac
* fix: formatting
* fix: preproccessor directive
* fix: comment display version for now
* fix: display dlls code
* fix: bundle vcruntime in hardsubx windows
* fix: again
* fix: erros in ci
* fix: ci
* fix: add vcruntime in additional dependencies
* fix: try to copy vcruntime after build
* fix: space in runtime library
* fix: remove for now [no ci]
* fix: things in vcxproj
* fix: ci for leptonica sys
* fix: docs
* fix: copy dlls on post build event
* fix: copy vcruntime after build
* feat: add arguments through clap
* fix: type of some arguments
* fix: "-" and "--" in comments
* fix: format files
* fix: add argument parsing till mkvlang
* fix: one todo item
* chore: lint fixes
* fix: nocodec value
* fix: for nocodec
* fix: add cfg feature for hardsubx
* feat: complete till startcreditstext
* fix: add more notes, args: option affect processed
* feat: port all till network stuff
* fix: complete almost all argument parsing
* fix: error free code
* fix: complete params port
* fix: hardsubx erros
* feat: clean up main function
* fix: pr reviews
* fix: make input,output function better
* fix: variant not used warning
* fix: warnings
* fix: all clippy warnings
* feat: add tests
* feat: add tests
* chore: lint fixes
* fix: move unit tests to correct folder
* fix: remove unncessary files
* fix: make function for parse_args
* fix: review changes
* fix: Impl CcxOptions whenever I could
* fix: try to convert rust to c
* chore: push c code
* fix: add more rust to c conversions
* fix: use set methods for bitfield
* fix: errors
* fix: arguments parsing
* fix: all issues
* fix: many errors
* chore: lint fix
* fix: err
* fix: unsafe function error
* fix: unsafe warning
* fix: safety lint
* chore: add docs
* fix: windows build
* fix: function
* fix: dependencies
* fix: set_binary_mode
* chore: lint fix
* fix: set_binary_mode for windows
* fix: error
* fix: undefined reference error
* chore: remove comment
* fix: output field
* chore: fix lint
* fix: ru1, ru2, ru3
* fix: undef before
* fix: parameter and update deps
* chore: update vcpkg
* feat: add release-with-debug profile
* fix; uncomment code
* fix: update visual studio to 2022
* chore: update docs
* fix: use default vcpkg
* fix: caching logic on release ci
* fix: vcpkg caching
* fix: add setup vcpkg
* chore: remove unneccesary formatting
* fix: Always write 2 bytes for UTF-16BE
* fix: formatting
* feat: add rest of the notes to bring continuity
* fix: remove extra line
* fix: add hardsubx note
* fix: source code format error
* chore: lint fixes acc to rustfmt
* feat: add unit test ci
* fix: conversion of strings, add file queue handling
* fix: decoder cfg
* fix: update dependencies
* chore: lint fix
* chore: add safety doc
* fix: default value for CcxOptions
* fix(rust): default value for teletext
* fix: leptonica version for windows
* fix: format errors
* fix: workflow
* Revert "fix: leptonica version for windows" This reverts commit 461ef55.
* fix: pin ffmpeg to 6 for mac
* fix(parser): default values and unwrap's
* fix(parser): hardsubx fixes
* chore(parse): lint fixes
* fix(windows): switch back to sdk 2019
* fix(workflow): windows workflow revert
* fix(windows): revert to old files which were working before
* fix(workflow): pin vcpkg packages
* chore(rust): downgrade leptonica
* fix(windows): move vcpkg.json to correct place
* fix(windows): improve vcxproj
* fix(windows): workflow
* fix(windows): workflow
* fix(windows): workflow clone from vcpkg everytime
* fix(workflow): error
* fix(workflow): don't skip building vcpkg
* fix: remove depth from vcpkg
* temporary commit
* fix(windows): pin gpac and use local vcpkg manifest properly
* fix(windows): install vcpkg dependencies manually
* fix(windows): update dll names
* fix(windows); dependencies copy
* fix(windows): don't continue on error for release
* fix(macos): build ffmpeg for mac workflow
* fix: move ffmpeg to current workspace
* fix: re-add profile for windows
* fix: pkg config for mac
* fix(mac): use ffmpeg@6 from brew
* fix(macos): there is no ffmpeg_prebuilt
* fix(macos): specify ffmpeg pkg config
* fix(macos): globally define pkg config
* fix(macos): add ffmpeg include and libs dir
* fix(macos): include ffmpeg headers in makefile
* fix: include ffmpeg libraries and include directories
* fix: try to manually specify ffmpeg header in rust
* fix: also include leptonica headres
* fix: leptonica name
* fix: test
* fix: string null when output_filename is empty
* fix: error
* fix: remove cflgas
* fix(mac): disable cmake ocr hardsubx
* chore: update gitignore
* fix: null if string is empty
* fix: allow --in
* chore: bump version to 1.0 in rust
* chore: add space to trigger sp
* fix: don't panic with rust
* fix: add double dashes to indicate parameters
* chore: update CHANGES.txt
* fix: test
* fix(workflow): update workflow name
* fix(rust): linux output_filename in sampleplatform
* fix(rust): parser default values
* fix(rust): exit with MalformedParameter instead of panic
* fix(decoder): revert always write 2 bytes
* chore(rust): format
* chore: update lock file
* fix(test): test lib_ccxr and rename to test
* fix(mac): remove failing cmake_ocr test
* fix: ci errors
* fix: feature related changes
* fix: trim down default features
* fix: don't check clippy for all features
1 parent 90204d4 commit 9340cc7

30 files changed: +6114 -1072 lines changed

.github/workflows/build_mac.yml

Lines changed: 0 additions & 16 deletions
@@ -74,22 +74,6 @@ jobs:
         working-directory: build
       - name: Display version information
         run: ./build/ccextractor --version
-  cmake_ocr_hardsubx:
-    runs-on: macos-latest
-    steps:
-      - uses: actions/checkout@v4
-      - name: Install dependencies
-        run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac ffmpeg
-      - name: cmake
-        run: |
-          mkdir build && cd build
-          cmake -DWITH_OCR=ON -DWITH_HARDSUBX=ON ../src
-      - name: build
-        run: |
-          make -j$(nproc)
-        working-directory: build
-      - name: Display version information
-        run: ./build/ccextractor --version
   build_rust:
     runs-on: macos-latest
     steps:

.github/workflows/format.yml

Lines changed: 1 addition & 1 deletion
@@ -51,4 +51,4 @@ jobs:
         run: cargo fmt --all -- --check
       - name: clippy
         run: |
-          cargo clippy --all-features -- -D warnings
+          cargo clippy -- -D warnings

.github/workflows/test_rust.yml

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+name: Unit Test Rust
+on:
+  push:
+    paths:
+      - ".github/workflows/test.yml"
+      - "src/rust/**"
+    tags-ignore:
+      - "*.*"
+  pull_request:
+    types: [opened, synchronize, reopened]
+    paths:
+      - ".github/workflows/test.yml"
+      - "src/rust/**"
+jobs:
+  test_rust:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ./src/rust
+    steps:
+      - uses: actions/checkout@v4
+      - name: cache
+        uses: actions/cache@v3
+        with:
+          path: |
+            src/rust/.cargo/registry
+            src/rust/.cargo/git
+            src/rust/target
+          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
+          restore-keys: ${{ runner.os }}-cargo-
+      - uses: actions-rs/toolchain@v1
+        with:
+          toolchain: stable
+          override: true
+      - name: Test main module
+        run: cargo test
+        working-directory: ./src/rust

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -17,8 +17,10 @@ CVS
 mac/ccextractor
 linux/ccextractor
 linux/depend
+windows/x86_64-pc-windows-msvc/**
 windows/Debug/**
 windows/Debug-OCR/**
+windows/release-with-debug/**
 windows/Release/**
 windows/Release-Full/**
 windows/Release-OCR/**
@@ -154,3 +156,4 @@ windows/ccx_rust.lib
 windows/*/debug/*
 windows/*/CACHEDIR.TAG
 windows/.rustc_info.json
+linux/configure~

docs/CHANGES.TXT

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
-0.95 (to be released)
+1.0 (to be released)
 -----------------
+- Breaking: Major argument flags revamp for CCExtractor (#1564 & #1619)
 - New: Create a Docker image to simplify the CCExtractor usage without any environmental hustle (#1611)
 - New: Add time units module in lib_ccxr (#1623)
 - New: Add bits and levenshtein module in lib_ccxr (#1627)
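
The flag revamp noted in this changelog entry shows up later in the commit as usage strings changing from `-out=<format>` to `--out=<format>`. As a rough illustration only (the parser added by this commit is written in Rust and is not shown in this excerpt), a hypothetical C helper, here called `split_long_option`, could recognise the double-dash `--name=value` form along these lines. The function name and its behaviour are assumptions made for the sketch, not CCExtractor code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of CCExtractor: splits an argument of the
 * form "--name=value" into its name and value parts.
 * Returns 1 on success, 0 if arg is not a double-dash option with a value
 * or if the name does not fit into the caller's buffer. */
int split_long_option(const char *arg, char *name, size_t name_size,
                      const char **value)
{
    if (arg == NULL || strncmp(arg, "--", 2) != 0)
        return 0; /* single-dash forms such as "-out=..." are rejected */

    const char *body = arg + 2;
    const char *eq = strchr(body, '=');
    if (eq == NULL || eq == body)
        return 0; /* no '=' present, or the option name is empty */

    size_t len = (size_t)(eq - body);
    if (len + 1 > name_size)
        return 0; /* name plus terminator would overflow the buffer */

    memcpy(name, body, len);
    name[len] = '\0';
    *value = eq + 1; /* points into arg; nothing is copied */
    return 1;
}
```

Called with `"--out=spupng"`, the helper fills `name` with `out` and points `value` at `spupng`; the legacy spelling `"-out=spupng"` is rejected, which mirrors the breaking nature of the change.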

docs/FFMPEG.md

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ Note:If you installed ffmpeg on non-standard location, please change/update your
 
 ### On Windows:
 #### Set preprocessor flag `ENABLE_FFMPEG=1`
-1. In visual studio 2013 right click <Project> and select property.
+1. In visual studio 2022 right click <Project> and select property.
 2. In the left panel, select Configuration Properties, C/C++, Preprocessor.
 3. In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
 4. In the Preprocessor Definitions dialog box, add `ENABLE_FFMPEG=1`. Choose OK to save your changes.

docs/OCR.md

Lines changed: 2 additions & 2 deletions
@@ -93,15 +93,15 @@ Download prebuild library of leptonica and tesseract from following link
 https://drive.google.com/file/d/0B2ou7ZfB-2nZOTRtc3hJMHBtUFk/view?usp=sharing
 
 put the path of libs/include of leptonica and tesseract in library paths.
-1. In visual studio 2013 right click <Project> and select property.
+1. In visual studio 2022 right click <Project> and select property.
 2. Select Configuration properties in left panel(column) of property.
 3. Select VC++ Directory.
 4. In the right pane, in the right-hand column of the VC++ Directory property, open the drop-down menu and choose Edit.
 5. Add path of Directory where you have kept uncompressed library of leptonica and tesseract.
 
 
 Set preprocessor flag ENABLE_OCR=1
-1. In visual studio 2013 right click <Project> and select property.
+1. In visual studio 2022 right click <Project> and select property.
 2. In the left panel, select Configuration Properties, C/C++, Preprocessor.
 3. In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
 4. In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.

mac/Makefile.am

Lines changed: 2 additions & 1 deletion
@@ -249,6 +249,7 @@ GPAC_CPPFLAGS = $(shell pkg-config --cflags gpac)
 
 ccextractor_CPPFLAGS =-I../src/lib_ccx/ -I../src/thirdparty/libpng/ -I../src/thirdparty/zlib/ -I../src/lib_ccx/zvbi/ -I../src/thirdparty/lib_hash/ -I../src/thirdparty/protobuf-c/ -I../src/thirdparty -I../src/ -I../src/thirdparty/freetype/include/
 ccextractor_CPPFLAGS += $(GPAC_CPPFLAGS)
+ccextractor_CPPFLAGS += $(FFMPEG_CPPFLAGS)
 
 ccextractor_LDADD=-lm -lpthread -ldl
 
@@ -271,7 +272,7 @@ if HARDSUBX_IS_ENABLED
 ccextractor_CFLAGS += -DENABLE_HARDSUBX
 ccextractor_CPPFLAGS+= ${libavcodec_CFLAGS}
 ccextractor_CPPFLAGS+= ${libavformat_CFLAGS}
-ccextractor_CPPFLAGS+= ${libavutil_CFALGS}
+ccextractor_CPPFLAGS+= ${libavutil_CFLAGS}
 ccextractor_CPPFLAGS+= ${libswscale_CFLAGS}
 AV_LIB = ${libavcodec_LIBS}
 AV_LIB += ${libavformat_LIBS}

src/ccextractor.c

Lines changed: 4 additions & 0 deletions
@@ -446,7 +446,11 @@ int main(int argc, char *argv[])
     // If "ccextractor.cnf" is present, takes options from it.
     // See docs/ccextractor.cnf.sample for more info.
 
+#ifndef DISABLE_RUST
+    int compile_ret = ccxr_parse_parameters(api_options, argc, argv);
+#else
     int compile_ret = parse_parameters(api_options, argc, argv);
+#endif
 
     if (compile_ret == EXIT_NO_INPUT_FILES)
     {
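
The hunk above selects the argument parser at compile time: unless `DISABLE_RUST` is defined, `main` calls `ccxr_parse_parameters`; otherwise it falls back to the C `parse_parameters`. A minimal, self-contained sketch of the same pattern follows. Every name in it (`demo_options`, `demo_c_parse`, `demo_rust_parse`, `demo_parse`) is a hypothetical stand-in and the stub bodies are invented for illustration; in the real build the Rust-side function is linked in from the Rust library rather than defined in C.

```c
/* Hypothetical stand-in for struct ccx_s_options. */
struct demo_options
{
    int parsed_by_rust; /* records which parser ran, for demonstration only */
};

/* Stub playing the role of the legacy C parser. */
int demo_c_parse(struct demo_options *opt, int argc, char *argv[])
{
    (void)argc;
    (void)argv;
    opt->parsed_by_rust = 0;
    return 0;
}

#ifndef DISABLE_RUST
/* Stub playing the role of the Rust parser; in a real build this symbol
 * would come from the Rust static library, not from C source. */
int demo_rust_parse(struct demo_options *opt, int argc, char *argv[])
{
    (void)argc;
    (void)argv;
    opt->parsed_by_rust = 1;
    return 0;
}
#endif

/* Same compile-time switch as in the hunk: Rust by default, C when the
 * build defines DISABLE_RUST. */
int demo_parse(struct demo_options *opt, int argc, char *argv[])
{
#ifndef DISABLE_RUST
    return demo_rust_parse(opt, argc, argv);
#else
    return demo_c_parse(opt, argc, argv);
#endif
}
```

Compiling the sketch with `-DDISABLE_RUST` routes `demo_parse` to the C stub without touching any call site, which is presumably why the commit keeps both parsers behind one guard.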

src/lib_ccx/lib_ccx.h

Lines changed: 3 additions & 0 deletions
@@ -161,6 +161,9 @@ extern void ccxr_init_basic_logger(struct ccx_s_options *opts);
 void print_end_msg(void);
 
 //params.c
+#ifndef DISABLE_RUST
+extern int ccxr_parse_parameters(struct ccx_s_options *opt, int argc, char *argv[]);
+#endif
 int parse_parameters (struct ccx_s_options *opt, int argc, char *argv[]);
 void print_usage (void);
 int atoi_hex (char *s);

src/lib_ccx/matroska.c

Lines changed: 2 additions & 2 deletions
@@ -1362,8 +1362,8 @@ int matroska_loop(struct lib_ccx_ctx *ctx)
     {
         if (ccx_options.write_format_rewritten)
         {
-            mprint(MATROSKA_WARNING "You are using -out=<format>, but Matroska parser extract subtitles in a recorded format\n");
-            mprint("-out=<format> will be ignored\n");
+            mprint(MATROSKA_WARNING "You are using --out=<format>, but Matroska parser extract subtitles in a recorded format\n");
+            mprint("--out=<format> will be ignored\n");
         }
 
         // Don't need generated input file

src/lib_ccx/params.c

Lines changed: 12 additions & 6 deletions
@@ -127,6 +127,13 @@ int parsedelay(struct ccx_s_options *opt, char *par)
     return 0;
 }
 
+void set_binary_mode()
+{
+#ifdef WIN32
+    setmode(fileno(stdin), O_BINARY);
+#endif
+}
+
 int append_file_to_queue(struct ccx_s_options *opt, char *filename)
 {
     if (filename[0] == '\0') // skip files with empty file name (ex : ./ccextractor "")
@@ -978,14 +985,14 @@ void print_usage(void)
     mprint(" a .d extension. Each .png file will contain an image representing one caption\n");
     mprint(" and named subNNNN.png, starting with sub0000.png.\n");
     mprint(" For example, the command:\n");
-    mprint(" ccextractor -out=spupng input.mpg\n");
+    mprint(" ccextractor --out=spupng input.mpg\n");
     mprint(" will create the files:\n");
     mprint(" input.xml\n");
     mprint(" input.d/sub0000.png\n");
     mprint(" input.d/sub0001.png\n");
     mprint(" ...\n");
     mprint(" The command:\n");
-    mprint(" ccextractor -out=spupng -o /tmp/output --12 input.mpg\n");
+    mprint(" ccextractor --out=spupng -o /tmp/output --12 input.mpg\n");
     mprint(" will create the files:\n");
     mprint(" /tmp/output_1.xml\n");
     mprint(" /tmp/output_1.d/sub0000.png\n");
@@ -1245,9 +1252,8 @@ int parse_parameters(struct ccx_s_options *opt, int argc, char *argv[])
         }
         if (strcmp(argv[i], "-") == 0 || strcmp(argv[i], "--stdin") == 0)
         {
-#ifdef WIN32
-            setmode(fileno(stdin), O_BINARY);
-#endif
+            set_binary_mode();
+
             opt->input_source = CCX_DS_STDIN;
             if (!opt->live_stream)
                 opt->live_stream = -1;
@@ -2934,7 +2940,7 @@ int parse_parameters(struct ccx_s_options *opt, int argc, char *argv[])
     }
     if (opt->write_format == CCX_OF_SPUPNG && opt->cc_to_stdout)
     {
-        print_error(opt->gui_mode_reports, "You cannot use -out=spupng with -stdout.\n");
+        print_error(opt->gui_mode_reports, "You cannot use --out=spupng with -stdout.\n");
         return EXIT_INCOMPATIBLE_PARAMETERS;
     }
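
The first hunk in this file moves the Windows-only `setmode` call into a `set_binary_mode()` helper, so the same behaviour is reachable from code outside `parse_parameters`. A standalone sketch of such a helper, with the headers it needs, might look like the following. It is an illustration under assumptions rather than the project's code: `demo_set_binary_mode` is a hypothetical name, it takes the stream as a parameter and returns a status (the helper in the diff does neither), and it uses the compiler-defined `_WIN32` macro with the Microsoft CRT names `_setmode`, `_fileno` and `_O_BINARY` instead of the `WIN32`, `setmode`, `fileno` and `O_BINARY` spellings seen in the diff.

```c
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h> /* _O_BINARY */
#include <io.h>    /* _setmode */
#endif

/* Hypothetical variant of the helper in the hunk above.
 * Switches the given stream to binary mode on Windows so that bytes such as
 * 0x0D 0x0A and 0x1A are not translated or treated as end-of-file.
 * Returns 0 on success (or when nothing needs doing), -1 on failure. */
int demo_set_binary_mode(FILE *stream)
{
#ifdef _WIN32
    return _setmode(_fileno(stream), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)stream; /* POSIX streams make no text/binary distinction */
    return 0;
#endif
}
```

A caller reading a transport stream from a pipe would invoke `demo_set_binary_mode(stdin)` once, before the first read.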

src/lib_ccx/ts_tables.c

Lines changed: 1 addition & 1 deletion
@@ -332,7 +332,7 @@ int parse_PMT(struct ccx_demuxer *ctx, unsigned char *buf, int len, struct progr
 #ifndef ENABLE_OCR
             if (ccx_options.write_format != CCX_OF_SPUPNG)
             {
-                mprint("DVB subtitles detected, OCR subsystem not present. Use -out=spupng for graphic output\n");
+                mprint("DVB subtitles detected, OCR subsystem not present. Use --out=spupng for graphic output\n");
                 continue;
             }
 #endif
