Commit 5c23597

authored
Cleaning up the README field (#2379)
1 parent 9d3deb7 commit 5c23597

1 file changed

+26
-58
lines changed

README.md

@@ -2,68 +2,61 @@
22
Elasticsearch real-time search and analytics natively integrated with Hadoop.
33
Supports [Map/Reduce](#mapreduce), [Apache Hive](#apache-hive), and [Apache Spark](#apache-spark).
44

5-
See [project page](http://www.elastic.co/products/hadoop/) and [documentation](http://www.elastic.co/guide/en/elasticsearch/hadoop/current/index.html) for detailed information.
5+
See [project page](https://www.elastic.co/elasticsearch/hadoop/) and [documentation](http://www.elastic.co/guide/en/elasticsearch/hadoop/current/index.html) for detailed information.
66

77
## Requirements
8-
Elasticsearch (__1.x__ or higher (2.x _highly_ recommended)) cluster accessible through [REST][]. That's it!
9-
Significant effort has been invested to create a small, dependency-free, self-contained jar that can be downloaded and put to use without any dependencies. Simply make it available to your job classpath and you're set.
8+
Elasticsearch cluster accessible through [REST][]. That's it!
9+
Significant effort has been invested to create a small, dependency-free, self-contained jar that can be downloaded and put to use without any dependencies. Simply make it available to your job classpath and you're set.
1010
For the requirements of a specific library, see the dedicated [chapter](http://www.elastic.co/guide/en/elasticsearch/hadoop/current/requirements.html).
1111

12-
ES-Hadoop 6.x and higher are compatible with Elasticsearch __1.X__, __2.X__, __5.X__, and __6.X__
13-
14-
ES-Hadoop 5.x and higher are compatible with Elasticsearch __1.X__, __2.X__ and __5.X__
15-
16-
ES-Hadoop 2.2.x and higher are compatible with Elasticsearch __1.X__ and __2.X__
17-
18-
ES-Hadoop 2.0.x and 2.1.x are compatible with Elasticsearch __1.X__ *only*
12+
While an effort has been made to keep ES-Hadoop backwards compatible with older versions of Elasticsearch, it is best
13+
to use the version of ES-Hadoop that is the same as the Elasticsearch version. See the
14+
[product compatibility support matrix](https://www.elastic.co/support/matrix#matrix_compatibility) for more information.
1915

2016
## Installation
2117

22-
### Stable Release (currently `8.15.1`)
23-
Available through any Maven-compatible tool:
18+
### Stable Release (`9.0.0` used in the examples below)
19+
Support for Hadoop is available through any Maven-compatible tool:
2420

2521
```xml
2622
<dependency>
2723
<groupId>org.elasticsearch</groupId>
2824
<artifactId>elasticsearch-hadoop</artifactId>
29-
<version>8.15.1</version>
25+
<version>9.0.0</version>
3026
</dependency>
3127
```
3228
or as a stand-alone [ZIP](http://www.elastic.co/downloads/hadoop).
3329

34-
### Development Snapshot
35-
Grab the latest nightly build from the [repository](http://oss.sonatype.org/content/repositories/snapshots/org/elasticsearch/elasticsearch-hadoop/) again through Maven:
36-
30+
Spark support depends on the versions of Spark and Scala your cluster uses. For Scala 2.12 and Spark 3.0, 3.1, 3.2, 3.3, or 3.4, use:
3731
```xml
3832
<dependency>
3933
<groupId>org.elasticsearch</groupId>
40-
<artifactId>elasticsearch-hadoop</artifactId>
41-
<version>9.1.0-SNAPSHOT</version>
34+
<artifactId>elasticsearch-spark-30_2.12</artifactId>
35+
<version>9.0.0</version>
4236
</dependency>
4337
```
44-
38+
For Scala 2.13 and Spark 3.2, 3.3, or 3.4, use:
4539
```xml
46-
<repositories>
47-
<repository>
48-
<id>sonatype-oss</id>
49-
<url>http://oss.sonatype.org/content/repositories/snapshots</url>
50-
<snapshots><enabled>true</enabled></snapshots>
51-
</repository>
52-
</repositories>
40+
<dependency>
41+
<groupId>org.elasticsearch</groupId>
42+
<artifactId>elasticsearch-spark-30_2.13</artifactId>
43+
<version>9.0.0</version>
44+
</dependency>
5345
```
5446

55-
or [build](#building-the-source) the project yourself.
56-
57-
We do build and test the code on _each_ commit.
5847

5948
### Supported Hadoop Versions
6049

61-
Running against Hadoop 1.x is deprecated in 5.5 and will no longer be tested against in 6.0.
62-
ES-Hadoop is developed for and tested against Hadoop 2.x and YARN.
50+
ES-Hadoop is developed for and tested against Hadoop 2.x and 3.x on YARN.
6351
More information is available in this [section](http://www.elastic.co/guide/en/elasticsearch/hadoop/current/install.html).
6452

53+
### Supported Spark Versions
54+
55+
Spark 3.0 through 3.4 are supported. Only Scala 2.12 is supported for Spark 3.0 and 3.1. Both Scala 2.12 and 2.13
56+
are supported for Spark 3.2 and higher.
57+
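The Scala/Spark support rules stated under "Supported Spark Versions" can be sketched as a small lookup. This is only an illustration: `SparkArtifactPicker` and `artifactFor` are hypothetical names that are not part of ES-Hadoop, and the sketch assumes versions are given as `major.minor[.patch]` strings. The artifact IDs are the two shown in the Installation section.

```java
import java.util.Optional;

/** Hypothetical helper (not part of ES-Hadoop) encoding the support rules stated in the README. */
final class SparkArtifactPicker {

    private SparkArtifactPicker() {}

    /**
     * Returns the Maven artifactId for a Spark/Scala pair, or empty if the pair is unsupported.
     * Versions are expected as "major.minor" or "major.minor.patch" (an assumption of this sketch).
     */
    static Optional<String> artifactFor(String sparkVersion, String scalaVersion) {
        int[] spark = majorMinor(sparkVersion);
        int[] scala = majorMinor(scalaVersion);

        // Spark 3.0 through 3.4 are supported.
        boolean sparkSupported = spark[0] == 3 && spark[1] >= 0 && spark[1] <= 4;
        if (!sparkSupported || scala[0] != 2) {
            return Optional.empty();
        }
        // Scala 2.12 is supported for every supported Spark version.
        if (scala[1] == 12) {
            return Optional.of("elasticsearch-spark-30_2.12");
        }
        // Scala 2.13 is supported only for Spark 3.2 and higher.
        if (scala[1] == 13 && spark[1] >= 2) {
            return Optional.of("elasticsearch-spark-30_2.13");
        }
        return Optional.empty();
    }

    /** Parses the first two numeric components of a dotted version string. */
    private static int[] majorMinor(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected major.minor, got: " + version);
        }
        return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
    }

    public static void main(String[] args) {
        System.out.println(artifactFor("3.1", "2.12"));   // supported pair
        System.out.println(artifactFor("3.4.1", "2.13")); // supported pair
        System.out.println(artifactFor("3.1", "2.13"));   // unsupported: Scala 2.13 needs Spark 3.2+
    }
}
```

Returning `Optional.empty()` for unsupported pairs, rather than throwing, lets a build script report the mismatch in its own way.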
6558
## Feedback / Q&A
66-
We're interested in your feedback! You can find us on the User [mailing list](https://groups.google.com/forum/?fromgroups#!forum/elasticsearch) - please append `[Hadoop]` to the post subject to filter it out. For more details, see the [community](http://www.elastic.co/community) page.
59+
We're interested in your feedback! You can find us on the [Elastic forum](https://discuss.elastic.co/).
6760

6861

6962
## Online Documentation
@@ -96,30 +89,7 @@ For basic, low-level or performance-sensitive environments, ES-Hadoop provides d
9689
(either by bundling the library along - it's ~300kB and there are no dependencies), using the [DistributedCache][] or by provisioning the cluster manually.
9790
See the [documentation](http://www.elastic.co/guide/en/elasticsearch/hadoop/current/index.html) for more information.
9891

99-
Note that es-hadoop supports both the so-called 'old' and the 'new' API through its `EsInputFormat` and `EsOutputFormat` classes.
100-
101-
### 'Old' (`org.apache.hadoop.mapred`) API
102-
103-
### Reading
104-
To read data from ES, configure the `EsInputFormat` on your job configuration along with the relevant [properties](#configuration-properties):
105-
```java
106-
JobConf conf = new JobConf();
107-
conf.setInputFormat(EsInputFormat.class);
108-
conf.set("es.resource", "radio/artists");
109-
conf.set("es.query", "?q=me*"); // replace this with the relevant query
110-
...
111-
JobClient.runJob(conf);
112-
```
113-
### Writing
114-
The same configuration template can be used for writing, but with `EsOutputFormat`:
115-
```java
116-
JobConf conf = new JobConf();
117-
conf.setOutputFormat(EsOutputFormat.class);
118-
conf.set("es.resource", "radio/artists"); // index or indices used for storing data
119-
...
120-
JobClient.runJob(conf);
121-
```
122-
### 'New' (`org.apache.hadoop.mapreduce`) API
92+
Note that es-hadoop supports the Hadoop API through its `EsInputFormat` and `EsOutputFormat` classes.
12393

12494
### Reading
12595
```java
@@ -187,8 +157,6 @@ As one can note, currently the reading and writing are treated separately but we
187157
## [Apache Spark][]
188158
ES-Hadoop provides native (Java and Scala) integration with Spark: a dedicated `RDD` for reading and, for writing, methods that work on any `RDD`. Spark SQL is also supported.
189159

190-
### Scala
191-
192160
### Reading
193161
To read data from ES, create a dedicated `RDD` and specify the query as an argument:
194162
